ABSTRACT

An enumeration algorithm which synthesizes programs from
example computations is presented. The algorithm, originally
proposed by Alan W. Biermann of Duke University, assigns a
labelling of the instructions contained in an example trace
consistent with producing minimum state Moore machine
representations for the synthesized programs. Techniques for
processing the information to reduce enumeration are given.
Biermann's algorithm is extended by trace preprocessing
techniques which identify and generalize conditions on
instruction sequencing in the synthesized programs without
the user's assistance. The techniques are presented using
text editing as the domain, but are general enough to be
extendable into other domains.
I. INTRODUCTION

A. BACKGROUND

Since the introduction of electronic computing machines,
manual tasks that are mundane, tedious and/or repetitious
have been considered for automation. The computer is ideally
suited for this type of work since it neither complains of
boredom nor wanders from its assigned task. The machine
meticulously sequences through a series of computations over
and over, producing answers consistent within the
limitations of the hardware. As consistent as the computer
is at performing tasks, assigning the tasks is still left to
the user of the system.

Programming the early machines was a difficult chore.
Communication between man and machine was possible only
through the language of the machine. This machine language
consisted of binary coded machine operations. The efficient
machine language programmer had to memorize these codes or
keep a list of the codes close by. All control transfer
points had to be coded in absolute machine addresses which
the programmer calculated by hand. A programmer had to
interpret the binary representation of the machine
operations to determine the cause of errors in programs.
There were no diagnostic messages to aid the user in
isolating errors. The difficulty of programming in machine
language led to a search to find better ways of generating
programs. The first step was the recognition that the
computer was a good bookkeeper, capable of computing
absolute addresses from labels and translating mnemonic
representations of machine operation codes. Webster's New
World Dictionary, Second Edition, defines mnemonic to be, "a
system or technique of improving memory by the use of
certain formulas." Soon programs were written which would
accept abstract programs containing mnemonics and labels,
convert the mnemonics into machine operation codes and
translate the labels into absolute machine addresses. These
programs produced executable machine language code as
output. These translation programs were called assemblers
and the data they translated were called assembly language
programs.

Assembly language provided some automation of the manual
tasks associated with machine language programming. An
important convenience of assembly language is the
readability of the programs when compared to machine
language programs. The mnemonics convey the meaning of their
function while the labels relieve the programmer of
calculating absolute addresses for control transfer points.
Assembly language provided a level of abstraction which
allowed programmers to concentrate on the programming
problem without dealing with every atomic machine operation.
The assembler performed bookkeeping, address translation and
mnemonic decoding quickly and efficiently. With assembly
language, programmers were now capable of producing more
code in less time with fewer errors.

Assembly language eased the programmer's task but it
still could not be considered a panacea for computer-human
interaction. Assembly language still required the programmer
to maintain control over many machine operations and he had
to provide the logic to control the flow of program
execution. The instructions used to perform control
functions appear as similar code fragments in most programs
written in assembly language. These code fragments performed
functions such as controlling branching decisions and keeping
count of loop indices. When it was observed that common code
fragments appeared across a wide range of assembly programs,
it was recognized that these code fragments could be
represented as a single instruction and the computer could
translate the single instruction into the code fragment it
represented. The programs that translate these complex
instructions are called compilers or interpreters. The
compiled or interpreted languages that followed assembly
language in this evolutionary process incorporated the
program fragments as single instructions of the language.
Constructs such as FOR, DO WHILE and IF THEN are examples of
higher level control structure implementation.

FORTRAN was the first in a long line of higher level
languages. FORTRAN differed from the others by becoming
endeared to a family of users and the language endures today
as one of the most frequently used higher level languages.
What qualities of the language produced this popularity?

The FORTRAN language is attributed to John Backus. His
primary goal when designing the language was to make the
language resemble the notation used in high school algebra.
Since the notation used in high school algebra was familiar
to a wide audience, FORTRAN gave a friendly appearance. The
language's apparent simplicity is the endearing quality of
FORTRAN. Some other language implementors failed to
recognize this point and their languages never received wide
acceptance. ALGOL is an example of a powerful language that
never received the acceptance anticipated.

Other programming languages that followed added compact
representation of other recurring program fragments. The
higher level constructs were not limited to control
structures but also included constructs for data
manipulation functions. Iverson's [1] APL (A Programming
Language) provided powerful operators capable of performing
complex functions such as matrix multiplication in one
instruction.

This trend continues today. Many of the newer languages
implement sophisticated and powerful operators and control
structures. Some of these languages are for a select segment
of computer users, intended for application to a particular
domain. The users are expected to be familiar with the
domain, so the form of the language should be familiar to
the user also. A problem with a domain specific language is
its inability to adapt to other areas. To work in another
area the user must become familiar with another language. A
phenomenon demonstrated by many computer users is a
reluctance to adapt themselves and learn a new language that
may be more appropriate for a given task. Either they break
the egg with a sledge hammer or dig the well with a spoon.
When required to use a new language, the user will likely
use only a small subset of the language that is capable of
doing the job. Worse than using only a subset of the
language features is the tendency to bring old programming
styles applicable to the old language into the new language.
The point to be made is that learning a new programming
language is a hard chore and is avoided whenever possible.

Another direction which the automation of programming
tasks has taken is the development of a programming
environment. A programming environment automates some of the
manual chores by providing the user with aids that assist
him in constructing programs. The environment includes a
programming language, an interactive syntax-directed editor
and an on-line debugger. The editor provides syntax error
diagnostics while the programmer is creating the source
file. The programmer is forced to correct the syntax error
immediately before the editor will allow him to continue
programming. The error should be readily apparent to the
programmer because it is in the latest input. The on-line
debugger allows the programmer to actively test his program,
halt execution, check the value of variables, change the
value of variables or change the code itself. Programming
environment systems may even allow the programmer to switch
from the editor to the on-line debugger and back at any
time. A programming environment can be summarized as a
friendly interface utilizing an intelligent editor which can
recognize syntax errors in the associated programming
language and one that contains other interactive programming
tools.

Programming has been called an art form requiring
intellectual creativity. The automation of intellectual
behavior is a field of study within Computer Science called
Artificial Intelligence. The study of the automation of
programming tasks which require human-like reasoning is
called Program Synthesis or Automatic Programming. It is not
our intention to provide a definition of intelligent
behavior for a machine since there is considerable
disagreement even among the experts. However, we note that
the goal of research in automatic programming is the same
goal that led to all the advances in programming languages.
Informally, this goal is to make the interaction between man
and computer as painless as possible. That is, painless for
the man but not necessarily for the computer. Dijkstra [2]
objects to the automation of programming by claiming, "We
should not automate programming even if we can, because it
would take away our enjoyment of the task." We note there
are those who may require the use of computer services but
have neither the time nor the inclination to obtain the
education required to program. These include professionals
such as lawyers, physicians, and even theoretical
physicists. We assume, if programming becomes fully
automated, the programmers will then turn their attention
toward other creative and stimulating pursuits. R. Hamming
has said, "The purpose of computing is insight not numbers."

Many on-going efforts are aimed at providing better
systems for the user so he may create programs faster, with
fewer errors and with less effort. The history of
programming language development has shown that automation
of many programming tasks is feasible. How much more of the
programming task can be automated? What would be considered
the ultimate system for producing computer programs?

B. AUTOMATIC PROGRAMMING

1. General

Program synthesis or automatic programming is a
research topic concerned with the development of systems
that provide more and more automation of the programming
process, particularly those tasks requiring human-like
reasoning. The goal is not to create systems that program
themselves, but to create systems which can construct, under
the direction of a user, programs that can perform some
function he desires. These systems must be easy to use, easy
to learn, and increase the efficiency of the user. The users
of these systems will no longer be restricted to the few
computer professionals, but will include other professional
fields as well as non-professionals. Automatic programming
systems are to interact with the user, recognize
requirements, and then synthesize a correct program that
satisfies the requirements.

Two questions arise in the research on automatic
programming. First, what is the form of the interaction
between the user and the system? This question is called the
specification problem because it is concerned with issues
relating to how the user is to inform the system of his
requirements. The second question is, given a specification
method, what synthesis technique can be applied to transform
the specification into an appropriate program? The technique
used for synthesis is often dependent upon the form of the
problem specification, and most of the projects involving
automatic programming consider both problems together. It
has been proposed by Green [3] that the two questions should
be separated, with research proceeding concurrently on both
problems. He proposes that there is a standard intermediate
representation of the problem specification which would
permit interaction between the two problems.

Four techniques have been proposed for the
specification problem which dominate the literature on
automatic programming. Each of the proposed techniques of
problem specification introduces a different approach to the
synthesis problem. The four specification techniques can be
categorized as follows:

1. Natural Language.

2. Formal Problem Specification.

3. Input-output Pairs.

4. Example Computations.

Each of these specification techniques will be discussed in
the following subsections, together with its relationship to
a synthesis approach.

2. Problem Specification with Natural Language

A visionary approach to the specification problem is
the use of natural language. Natural language provides a
fast, comfortable method of communication which is already
understood by humans. Implementation of a natural language
understanding system has proven to be a very difficult
problem (Class [4]).

Two forms of natural language are the spoken form
and the written form. Understanding spoken language
increases the degree of difficulty because the communication
is in the form of audio waves. Once the audio input is
captured, it must be converted into another form for further
syntactic and semantic analysis. The reader will note that
once the audio input has been captured and converted the
problem of written and spoken language becomes the same.
That is, the internal representation of the spoken and
written word can be the same and the problem becomes one of
inferring meaning from the representation. Future advances
in voice understanding hardware can be expected and these
advances may be expected to find their way into use.

A complete natural language understanding system
would be expected to be able to understand all grammatically
correct sentences. However, natural languages do not have
finite grammars. This complexity implies a complete
understanding system cannot be implemented. However, a
system capable of understanding a subset of natural language
can prove useful in specific domains. Early examples of
programming through natural language dialogue are presented
in a survey by Heidorn [5]. Current work on understanding
natural language may be found in Biermann [6] and Walker
[7].

In conclusion, natural language understanding is a
difficult problem that can be solved only in limited
domains. The use of natural language in programming has been
shown to be possible by Heidorn [5], and by Biermann [6] in
limited domains. The systems developed up to today have been
experimental systems and the results will aid in
understanding the problem. Natural language programming
systems will not be available for industry for at least a
decade. Finally, we present the example Biermann [6]
describes as a natural language specification for a problem.
This example is quoted from his paper on natural language
programming. Its intent is to give a feel for programming in
natural language. This example does not specify the
algorithm that is to be used, although a natural language
programming system would be capable of accepting such a
specification.

"When I ask for a status report on a
doctoral student, give me his or her year
in grad school, source and amount of
financial support, and which core exams
have been passed. If the student has begun
a thesis give me the advisor and thesis
topic."

3. Formal Problem Specification

The second technique is formal specification of the
problem. As the name implies, the input is in a more rigid
structure than natural language. This technique allows the
user to convey the behavior he desires the synthesized
program to have without specifying the algorithm that is to
be used. Smith [8] gives the following definition for the
form of a formal specification of a problem A.

"A(x) = z such that z ∈ S & P(z,x) where x ∈ D &
I(x) where D and S are the input and output data
types respectively, and I and P are the input and
output conditions respectively."

An example of a formal problem specification for a program
to compute the integer square root of a nonnegative integer
n may be found in Manna and Waldinger [9].

    sqrt(n) <== FIND z SUCH THAT
                    integer(z) & z**2 =< n < (z+1)**2
                WHERE integer(n) & 0 =< n

In the above example n is an element of the input data type,
z is an element of the output data type, sqrt is the problem
name, integer(n) & 0 =< n is the input condition, and
integer(z) & z**2 =< n < (z+1)**2 is the output condition.
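
To make the roles of the input and output conditions concrete, the following minimal Python sketch writes both as predicates and finds z by brute-force search. The function names are assumptions made for illustration. The search only demonstrates what the specification demands of z; it is not the program a synthesizer would derive.

```python
def input_condition(n):
    """I(n): n is a nonnegative integer."""
    return isinstance(n, int) and n >= 0


def output_condition(z, n):
    """P(z, n): z is an integer with z**2 =< n < (z+1)**2."""
    return isinstance(z, int) and z * z <= n < (z + 1) * (z + 1)


def sqrt_by_search(n):
    """Return the z required by the specification, found by trying 0, 1, 2, ..."""
    if not input_condition(n):
        raise ValueError("input condition violated")
    z = 0
    # The specification states what z must satisfy, not how to find it;
    # this loop is simply the most naive way to meet the output condition.
    while not output_condition(z, n):
        z += 1
    return z
```

Any candidate program for sqrt can be checked against the same two predicates, which is what makes a formal specification usable as a correctness criterion.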

Formal problem specification and its application to
the program synthesis problem can best be explained through
examination of the work by Manna and Waldinger [9], Manna
and Waldinger [10], and Smith [8]. Although all of the work
is similar in that the formal specification is changed into
an appropriate program by some form of rewrite, it is
valuable to differentiate the approaches by their rewriting
methods.

The first example is the system of Manna and
Waldinger [9]. Their system, called a deductive approach,
converts the formal specification into a program in some
target language. Their approach "combines techniques of
unification, mathematical induction, and transformation
rules into a single system." The following is a brief
explanation of this conversion.

A structure is needed to contain initial and
intermediate results of the conversion process. This
structure is called a sequent. The sequent is a tableau
containing two lists. The first list is a list of assertions
and the second list is a list of goals. Each element in
either list may have an output expression associated with
it. Figure 1 represents a sequent as a table. Each row in
the table may contain either an assertion or a goal but not
both. Figure 1 is the initial sequent for the integer square
root problem given above. The input condition has been
placed in the assertion list and the output condition placed
in the goal list. The output variable is associated with the
output condition in the output expression column. This
initialization assumes the input condition is true, and a
search is attempted for the truth of the goal or output
condition.

    sqrt(n) <== FIND z SUCH THAT
                    integer(z) and z**2 =< n
                    and n < (z+1)**2
                WHERE integer(n) and 0 =< n

    +----------------+----------------+----------+
    | Assertions     | Goals          | Output   |
    |                |                | sqrt(n)  |
    +----------------+----------------+----------+
    | integer(n)     |                |          |
    | and            |                |          |
    | 0 =< n         |                |          |
    +----------------+----------------+----------+
    |                | integer(z)     |          |
    |                | and            |          |
    |                | z**2 =< n      | z        |
    |                | and            |          |
    |                | n < (z+1)**2   |          |
    +----------------+----------------+----------+

Figure 1. Initialized Sequent for the Square Root Problem

During this search, if the sequent ever contains a row where
the assertion can be trivially shown to be false or the goal
shown to be true, and if the output expression for that row
contains only primitives from the target language, then the
output expression is taken as the desired synthesized
program.
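
The tableau and its termination test can be sketched as a small data structure. The Python below is an illustrative assumption, not the representation used by Manna and Waldinger: formulas are plain strings, "trivially false" and "trivially true" are modelled as the literal strings "false" and "true", and the names Row, initial_sequent and extract_program are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    assertion: Optional[str] = None  # exactly one of assertion / goal is set
    goal: Optional[str] = None
    output: Optional[str] = None     # output expression, if any

    def __post_init__(self):
        if (self.assertion is None) == (self.goal is None):
            raise ValueError("a row holds an assertion or a goal, not both")


def initial_sequent(input_cond, output_cond, output_var):
    """Input condition becomes an assertion; output condition becomes a goal."""
    return [Row(assertion=input_cond), Row(goal=output_cond, output=output_var)]


def extract_program(sequent, primitives):
    """Return the output expression of the first row that ends the search.

    A row ends the search when its assertion is false or its goal is true
    and its output expression uses only target-language primitives.
    """
    for row in sequent:
        closed = row.assertion == "false" or row.goal == "true"
        if closed and row.output is not None:
            if all(token in primitives for token in row.output.split()):
                return row.output
    return None
```

The deductive rules themselves, which add new rows to the sequent, are deliberately left out of this sketch.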

Once the tableau is initialized, the system's
deductive rules are applied to the assertions and goals. The
application of these rules will cause the creation of new
assertions and goals and associated output expressions. The
rules may then be applied to the new goals and assertions
until the condition for a program is satisfied. The
application of the rules changes the entries in the tableau
without changing the meaning of the tableau. We recommend
that the interested reader review the original work for a
description of the rules and their application.

The attraction of this theorem-proving technique is
that the resulting program can be proven correct by the same
steps used to create it. Currently there is not a running
implementation of this technique. One of the implementation
questions is determining what rule to apply at each step in
the synthesis process. This problem can be viewed as a
search through all possible sequences of rule applications.
This search space may become astronomical for any relatively
complex program since it may require hundreds of rule
applications. What is needed is a mechanism that can control
the search in a reasonable fashion. The form of control may
be heuristic, in that there is a feel for where a rule should
be applied. If this intuitive feel can be quantified, then
this technique may become practical.

Earlier work by Manna and Waldinger [10] on the
DEDALUS automatic programming system also required formal
problem specifications. The DEDALUS system, an implemented
automatic programming system, utilized only transformation
rules. A transformation rule simply rewrites a portion of the
specification into another equivalent form. The continuous
application of these rules would eventually result in a
program in the target language.

4. Input-Output Pair Specification

Input-output pair specification is a method of describing a
problem with examples of input and output behavior. For
example, if someone wanted to describe a program to compute
the Fibonacci numbers then he could supply the input-output
pairs.

    (1, 1)

    (2, 3)

    (3, 5)

    (5, 8)

    (8, 13)

The goal of a synthesizer system is to determine the
desired program from the examples of the input-output
behavior. One approach is to enumerate all possible programs
in the target language in order and test each program for
the desired behavior. That is, test each enumerated program
by giving it the input from each of the examples and see if
the program will give the associated output. The enumeration
will produce the correct program at some point, but one
cannot determine if an arbitrary program can produce the
desired behavior (see Biermann [11]). Therefore, the
following theorem is given by Biermann, "The programs for
the partial recursive functions cannot be generated from
samples of input-output behavior." A large class of programs
may be inferred from examples of input-output pairs provided
they belong to the class of programs where the halting
problem is decidable. Smith [12] and Summers [13] have
looked at the synthesis of LISP programs from example
input-output pairs. It has been shown that a restricted
class of LISP programs can be synthesized from example pairs
without enumeration over the class. The reader is invited to
review Biermann [14] and Gold [15] for theoretical
background information.
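
The enumerate-and-test approach can be illustrated with a deliberately tiny target language. In the Python sketch below, a program is a sequence of primitive operations applied left to right. The primitives, the length bound and the function names are assumptions chosen so that every program halts, which sidesteps the decidability problem noted above.

```python
from itertools import product

# Hypothetical target language: each primitive is a total function on integers.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: 2 * x,
    "square": lambda x: x * x,
}


def run(program, x):
    """Apply the primitives named in program to x, left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x


def synthesize(pairs, max_length=6):
    """Enumerate programs shortest first; return the first one matching all pairs."""
    for length in range(max_length + 1):
        for program in product(sorted(PRIMITIVES), repeat=length):
            if all(run(program, given) == wanted for given, wanted in pairs):
                return program
    return None  # no program within the length bound fits the examples
```

For the pairs (1, 4), (2, 6), (3, 8) the search returns ("inc", "double"), that is, the program computing 2(x + 1). The number of candidates grows exponentially with program length, which is the practical weakness of pure enumeration.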

5. Example Computations

Program specification using example computations
allows more information to be obtained from the user. An
example computation is a sequence of instructions, without
an explicit control structure, which the user provides the
system in order to describe the behavior he wants from a
program. Examples are a good communication method which
people use to describe new concepts or explain new
processes. To describe a problem to the computer the user
uses the available instructions and provides an example of
what he wants done. Figure 2 shows an example computation
that demonstrates how to compute the first 10 Fibonacci
numbers.

In Figure 2 the two operand instructions (MOV, ADD)
perform the action on the two operands and leave the result
in the first operand. For example, if A = 2 and B = 3 then
ADD A,B would result in A = 5 and B = 3. All of the
instructions perform an action on some variable except for
the START, HALT, and NOTE instructions. START and HALT flag
the beginning and end of the program respectively. The NOTE
instruction provides information on the reason for the
execution of the next instruction.
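
The instruction semantics described here can be checked with a small interpreter for straight-line traces. The Python sketch below is an assumption about the instruction set inferred from this description only (MOV, ADD, DCR, PRINT, START, HALT, NOTE); the function name run_trace and the textual operand format are hypothetical.

```python
def run_trace(trace):
    """Execute a straight-line example computation.

    Returns the final variable values and the list of printed values.
    """
    variables = {}
    printed = []

    def value(operand):
        # An operand is either a variable name or an integer literal.
        return variables[operand] if operand in variables else int(operand)

    for line in trace:
        opcode, _, rest = line.partition(" ")
        operands = [field.strip() for field in rest.split(",")] if rest else []
        if opcode in ("START", "HALT", "NOTE"):
            continue  # these instructions act on no variable
        if opcode == "MOV":
            variables[operands[0]] = value(operands[1])
        elif opcode == "ADD":
            variables[operands[0]] += value(operands[1])  # result in first operand
        elif opcode == "DCR":
            variables[operands[0]] -= 1
        elif opcode == "PRINT":
            printed.append(variables[operands[0]])
        else:
            raise ValueError(f"unknown instruction: {opcode}")
    return variables, printed
```

Running the trace START, MOV A,2, MOV B,3, ADD A,B, PRINT A, HALT leaves A = 5 and B = 3 and prints 5. A trace carries no control structure, so the interpreter needs no branching; recovering the loops is the job of the synthesizer.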

This method of specification depends on the user to
supply more information about the problem, including the
algorithm to be synthesized. The algorithm is implicitly
defined by the example computation that is given. This
specification technique should be contrasted with the
previous techniques. Note that the formal specification and
the input-output pair specification only required the user
to specify the desired behavior without specifying the
algorithm. Thus it can be claimed that these two methods
intentionally ignore information that the user has, assuming
that most users have an idea of the form of the algorithm.

    START
    MOV    A,1
    MOV    B,1
    MOV    C,10
    PRINT  A
    DCR    C
    ADD    A,B
    PRINT  A
    DCR    C
    ADD    B,A
    PRINT  B
    DCR    C
      .
      .
      .
    PRINT  A
    DCR    C
    NOTE   C =< 0
    HALT

Figure 2. An Example Computation

The primary contributor to the understanding of
program synthesis has been Alan W. Biermann (see Biermann
and Krishnaswamy [16] and Biermann, Baum and Petry [17]). In
particular, Biermann [16] provides a formal definition of an
algorithm that will synthesize programs from example
computations. The algorithm and variations have provided the
basic structure upon which this thesis has been developed.
Briefly, the algorithm identifies the conditions that may
have inadvertently (or purposely) been left out of the
computation. A condition is a predicate as defined in
predicate calculus, that is, an entity for which a truth
value may be measured. Once the omitted conditions have been
inserted, the algorithm finds a labelling for the
instructions such that a program with a minimum number of
instructions is produced. To explain this labelling, assume
the instruction ADD A,B appears in three different locations
in an example computation (see Figure 2). Suppose it was
known that the synthesized program has to contain two
occurrences of the instruction. Then two of the instructions
could be labeled with a 1 and the other instruction labeled
with a 2 to indicate that the instruction labeled 2 is
different from the instructions labeled 1. Finding the labels
for the instructions in the example computations requires an
enumeration search of all possible labellings. The labelling
selected is the first labelling that produces a program that
is deterministic.
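
The labelling search can be sketched as a brute-force enumeration. The Python below is a simplified illustration and not Biermann's algorithm as published: it assumes each trace entry is an (instruction, condition) pair in which the condition, possibly None, governs the transition to the next entry, and it treats each distinct (instruction, label) pair as one state of the synthesized program. The function names are hypothetical.

```python
from itertools import product


def is_deterministic(states, conditions):
    """Each (state, condition) pair may lead to only one successor state."""
    successor = {}
    for i in range(len(states) - 1):
        key = (states[i], conditions[i])
        if successor.setdefault(key, states[i + 1]) != states[i + 1]:
            return False
    return True


def find_labelling(trace):
    """Return the first labelling, by increasing state count, that is deterministic."""
    instructions = [instruction for instruction, _ in trace]
    conditions = [condition for _, condition in trace]
    distinct = len(set(instructions))
    for state_count in range(distinct, len(trace) + 1):
        # With state_count states, one instruction has at most this many copies.
        max_label = state_count - distinct + 1
        for labels in product(range(1, max_label + 1), repeat=len(trace)):
            states = list(zip(instructions, labels))
            if len(set(states)) == state_count and is_deterministic(states, conditions):
                return labels
    return None
```

For the trace A, B, A, B, A, C with conditions x>0, x>0 and x<=0 attached to the three occurrences of A, a single label per instruction suffices and the program has three states. With the conditions removed, the only deterministic labelling gives every occurrence its own state, so the program is as long as the trace. This is why identifying conditions matters for producing small programs.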


This algorithm is complete and the synthesized
programs are sound. Completeness means that the algorithm
can synthesize every possible program. Soundness means that
the synthesized program will correctly execute the example
used to construct it. A disadvantage of this synthesis
method is that the algorithm is an enumeration search and in
the worst case will require time exponential in the length
of the example computation to find a solution. Techniques
have been developed to speed up this search that will
produce satisfactory response for most practical programs.

6. A General Automatic Programmer Design

Before leaving this section on automatic programming we
wish to discuss a design for an automatic programmer that
uses at least two of the specification techniques. The name
of the system is PSI and it was designed by a group of
researchers at Stanford's Artificial Intelligence
Laboratory. The research effort was headed by Cordell Green
[3]. Green has presented a high level design of an
autoprogrammer that identifies some of the more important
areas that need further research. Green admits that the
design was an effort to focus attention on some of the
sub-areas of the overall synthesis problem. His modular
design does focus attention on different aspects of the
problem. The design decision to split the overall problem
into two main sub-problems of acquisition and synthesis is of
particular interest. This design choice allows work to
proceed concurrently on two hard problems with the interface
between the problems being some intermediate representation
of the problem.

PSI is a knowledge-based program understanding
system organized as a collection of interacting modules.
Figure 3 details the high level modular design of the PSI
system. The PSI design divides the system into two groups.
The acquisition group interfaces with the user and collects
the specification given by the user while the synthesis
group produces a program in some target language that meets
the user's requirements. Communication between the two
major groups is through an intermediate representation
called the program model. The goal of the acquisition group
is to accept the user's specification by either natural
language dialogue or by traces, and present a unified entity
to the synthesizer group. The implementation of the
synthesizer group is then simplified because of the
consistent representation it receives. Since the user's
input is converted into an intermediate representation that
is supplied to the synthesizer group, the user is free to
switch from one specification technique to another during
program specification.

The overall interaction with the user is meant to be
through natural language dialogue. Since natural language
understanding is not currently within the state of the art,
the system must interact in a subset of natural language
limited to a particular domain.

Figure 3. PSI's Modular Design [3, p. 6]

The system-user interaction is to appear as natural
as possible. The system has been designed to include a
mixed-initiative dialogue capability which means the user or
the computer can assume the dominant communication role at
different times during the discourse. This allows the user
to provide as much knowledge as he can to help the synthesis
process and allows the computer to assist the user by asking
questions or providing responses. The system develops a
current model of the user and a model of the context that
assists the system in determining when to assume the
initiative and what questions to ask the user.

A partial implementation was completed in 19?5 that
included the synthesis expert and the efficiency expert from
the synthesis group. The acquisition group modules have
proven to be a more difficult assignment and only portions
of the acquisition group have been implemented. The important
point of the PSI design is that it provides a modular
division of the program synthesis problem that helps provoke
study into these sub-problems.

C. OBJECTIVES

Automatic programmers, which synthesize programs from
example computations, require conditions to be explicitly
defined by the user in order to generate programs with a
minimum number of instructions. Previous work (Biermann and
Krishnaswamy [16], and Biermann [18]) has reduced the
number of required conditions, but has not eliminated the
need for the user to explicitly state a minimal set of
conditions.

The explicit definition of conditions is not a natural
part of an example computation. That is, one would not
normally give control structure information when using
examples to explain how a task is to be performed. Our
objective is to provide an environment where the user may
define the tasks he wants accomplished without explicitly
defining the control structures that specify the flow of
execution in a synthesized program.

We will implement an automatic programming system based
upon the example computation specification method in order
to study the feasibility of identifying conditions from user
actions. We limit this study to the domain of text editing
in order to provide a well defined area in which to work. It
is hoped that the results of our efforts may provide insight
into the overall problem and generate further research which
will extend condition identification to other domains.

D. THESIS ORGANIZATION

The thrust of this thesis is the development of methods
for the automatic construction of conditions necessary for
the proper synthesis of programs from example computations.
Example computation is one approach to the problem of
program synthesis. Chapter One introduces the reader to
program synthesis and gives a brief historical perspective
of the evolution of this field of study. Chapter One also
provides a comparison of the different proposed approaches
to this problem.

An automatic programmer has been implemented to support
this research. This synthesizer was developed to use the
example computation method for program specification.
Chapter Two is a detailed explanation of our particular
implementation. Chapter Two includes a discussion of
techniques we have incorporated in our implementation which
speed up the synthesis process.

Chapter Three presents our approach to generating 
conditions given an example computation. It describes 
algorithms which will generate conditions from a sequence of 
editor instructions. 



Chapter Four discusses the result of our research. A
brief discussion is included on the merits of the
synthesizer which we have implemented and recommendations
are given for potential improvement. Finally, Chapter Four
presents a review of our work on identification and
construction of conditions from example computations. Areas
requiring further research have been highlighted and
examples of possible applications to other domains have been
pointed out.
II. SYNTHESIZER



A. GOALS 

There is a two-fold purpose behind designing and
building the program synthesizer. The first directly relates
to the usefulness of the synthesizer. It is hoped that by
"laying the groundwork" for an autoprogramming system, the
impetus will be provided that will eventually result in a
total automatic programming environment being available for
the user. This environment is envisioned as an interactive
one consisting of several components: an interface to
provide the user with the means to perform example
computations, a link between the interface and the
synthesizer which records the user actions and transmits a
trace of those actions to the synthesizer, the synthesizer
itself which produces the algorithm in some internal form,
and, finally, a translator that receives the internal
representation of the algorithm and translates it into
machine-readable form and/or user-readable form. The second
purpose for which the synthesizer is built is to provide a
suitable vehicle to be used in the main area of research
that this thesis explores. If an autoprogrammer can generate
correct algorithms from example computations, how much can
be done to relieve the user from having to include branching
or looping conditions in his example computations?
B. OVERVIEW

1. General Description

An automatic programming system which produces
programs based upon the user's input of example computations
has a natural appeal. Example computations are sequences of
instructions performed in an algorithmic manner. For
instance, if the user is doing a matrix multiply, computing
the entry for the resultant matrix involves the sum of
products from the appropriate row and column of the
multiplicand and multiplier matrices, respectively. When
humans communicate ideas to each other, the proper use of
example computations often plays a vital role. It is hard to
imagine trying to explain the method of multiplying two
matrices together, or trying to explain the concept of
set-subset relationships, without being able to draw examples
that enhance the explanations. This method of communication
seems to be vital to human understanding of algorithms.
Since programmers often use small example computations while
coding programs, it seems that a logical approach to
automatic programming would consist of the machine doing the
actual program synthesis based upon example computations
given by the programmer.

Program synthesis is the act of putting instructions
together in such a way that an algorithm is built which
accomplishes a desired task. Obviously, an algorithm which
is an exact replication of the sequence of instructions will
accomplish the task, but it is uninteresting since it cannot
be generalized to accomplish a set of related tasks. For
example, a linear sequence of instructions which multiplies
two 2 x 2 matrices together will only work for 2 x 2
matrices; however, by allowing loop constructs and if-then
constructs, an algorithm can be produced which performs the
more general task of multiplying any two matrices with legal
row and column dimensions. So, in the case of the matrix
multiply, the task of the program synthesizer is to produce
a general matrix multiply algorithm given the example
computation for a 2 x 2 matrix multiplication in some form
such as:



c[1,1] = a[1,1] * b[1,1] + a[1,2] * b[2,1]
c[1,2] = a[1,1] * b[1,2] + a[1,2] * b[2,2]
c[2,1] = a[2,1] * b[1,1] + a[2,2] * b[2,1]
c[2,2] = a[2,1] * b[1,2] + a[2,2] * b[2,2]
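
The generalization the synthesizer is expected to reach can be sketched as follows. This is only an illustration of the target algorithm written by hand, not output of the synthesizer; the function name `matrix_multiply` and the list-of-lists representation are assumptions made for the example.

```python
def matrix_multiply(a, b):
    """Multiply an n x m matrix by an m x p matrix using explicit loops."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("column count of a must equal row count of b")
    c = [[0] * p for _ in range(n)]
    for i in range(n):              # row index of c
        for j in range(p):          # column index of c
            for k in range(m):      # stops when the column index of a reaches its column size
                c[i][j] += a[i][k] * b[k][j]
    return c


# The 2 x 2 case unrolls into exactly the four assignments listed above.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = matrix_multiply(a, b)
```

The loop bounds play the role of the conditions that the straight-line 2 x 2 computation lacks.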



Generalizing from the example computation also
requires some means of noting when the array bounds have
been reached for this example. In other words, conditions
have to be interposed between some instructions where a
change in the flow of control for the algorithm is
necessary. An input trace is defined as a sequence of
instructions and conditions which describes the example
computation. In the matrix multiply example this might be
accomplished thusly:
C[1,1] = 0
C[1,1] = C[1,1] + A[1,1] * B[1,1]
C[1,1] = C[1,1] + A[1,2] * B[2,1]
COND - col index of A = col size of A
C[1,2] = 0
C[1,2] = C[1,2] + A[1,1] * B[1,2]
C[1,2] = C[1,2] + A[1,2] * B[2,2]
COND - col index of A = col size of A
   .
   .
   .
C[2,2] = C[2,2] + A[2,2] * B[2,2]
COND - row & col index of C = Dimension of C
STOP

The program synthesizer used for this thesis is
designed around concepts and ideas on synthesizing a program
given example traces as described in reference [17].
Previous research, references [16], [17], and [18], seems to
indicate that correct programs can be synthesized on the
basis of relatively few sample computations, but that the
amount of time required to do the synthesis grows very
quickly as a function of program complexity.

2. Trace Coding

The synthesis procedure is domain independent; that
is, the input trace can be coded into any consistent
representation, and it will not affect the operation of the
synthesizer. Since the synthesis procedure is independent of
the input trace representation, alphanumeric characters will
be used to represent instructions and conditions. They are
distinguished from each other by their position within the
trace rather than by their symbolic representation. For
example, an 'a' might represent an instruction or a
condition. Within the instruction set itself, identical
instructions are encoded as identical symbols. A simple
trace of a routine to find all positive numbers in an input
stream might be:

A = 0
READ B
COND - B is negative
A = A + 1
READ B
COND - B is negative
A = A + 1
READ B
COND - B is positive
PRINT B
   .
   .
   .

If the instruction A=A+1 is represented by a 'b', each
occurrence of that instruction in the trace will have to be
represented by a 'b'. The reason for this constraint is
obvious. Since the synthesizer only receives a trace of the
example execution, it cannot determine whether A=A+1 is the
same instruction being encountered repeatedly in a loop, as
it is in this example, or whether there are several
independent occurrences of A=A+1. Figure 4 is an example of
a typical coded input trace. The left-hand column entries
are conditions and the right-hand column entries are
instructions. Figure 4 is read as state 's' transitions on
condition 'x' to state 'a', which in turn transitions on 'x'
to state 'b', and so forth.



transitions   states

                s
     x          a
     x          b
     x          c
     x          b
     y          e
     x          c
     x          b
     x          c
     y          s
     y          a
     x          b
     y          4
     x          b
     x          f
     x          d
     x          b
     x          f
     x          d
     y          s

Figure 4. Input Trace
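
The coding rule described above (identical instructions receive identical symbols, and instruction and condition alphabets are kept apart by position) can be sketched as below. The function `code_trace` and the tuple layout of its input are hypothetical names chosen for this illustration; they are not taken from the implementation.

```python
from string import ascii_lowercase


def code_trace(steps):
    """Code a trace given as ("instr" | "cond", text) pairs.

    Instructions and conditions get separate symbol tables, so the same
    letter may stand for an instruction and for a condition.
    This sketch supports at most 26 distinct symbols per table.
    """
    tables = {"instr": {}, "cond": {}}
    coded = []
    for kind, text in steps:
        table = tables[kind]
        if text not in table:
            table[text] = ascii_lowercase[len(table)]  # next unused letter
        coded.append((kind, table[text]))
    return coded, tables["instr"], tables["cond"]


steps = [
    ("instr", "A = 0"), ("instr", "READ B"), ("cond", "B is negative"),
    ("instr", "A = A + 1"), ("instr", "READ B"), ("cond", "B is negative"),
    ("instr", "A = A + 1"), ("instr", "READ B"), ("cond", "B is positive"),
    ("instr", "PRINT B"),
]
coded, instr_table, cond_table = code_trace(steps)
```

Here both occurrences of A = A + 1 map to the same instruction symbol, while the letter 'a' is used once as an instruction symbol and once as a condition symbol.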



3. Input/Output Trace Representation 

A Moore-type representation, as defined in [17], can
be used to highlight certain features that must be dealt
with when producing an algorithm from an example trace.
Throughout the rest of the discussion, Moore machines and
algorithms will be used synonymously. Conditions relate to
transitions and instructions relate to states of the
machine. In fact, the function of the synthesizer can be
viewed as that of determining a minimum-state deterministic
Moore machine equivalent of a non-deterministic Moore
machine. Representing input traces as Moore machines will
often show the non-deterministic structure of the example
trace. This non-determinism must be resolved by the
synthesizer in order for an algorithm to be generated.
Figure 5 is the Moore machine representation of the input
trace of Figure 4. Notice that at node 'b', the trace is
non-deterministic. Transition 'y' leads from node 'b' to two
different nodes; similarly, transition 'x' leads from node
'b' to two separate nodes. Figure 6 is the deterministic
Moore machine which has been constructed by our synthesizer
based upon the input trace given in Figure 4. The
non-determinism has been resolved by splitting state 'a'
into two states distinguished from each other by an integer
prefix label. The assignment of the prefix label is the
mechanism used by the synthesizer to prevent
non-determinism. In order to accomplish this assignment, the
synthesizer uses an enumeration technique. Each instruction
is assigned a prefix label in a manner that maintains
determinism and assures that the algorithm will correctly
execute the input trace. It is easy to verify that the
deterministic Moore machine of Figure 6 will execute the
trace.
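
As a rough sketch of how the non-determinism in a coded trace can be located, the fragment below treats every instruction symbol as a single state and reports each (state, condition) pair that leads to more than one successor. The short trace used here is hypothetical and is not the trace of Figure 4; the function name is likewise an assumption.

```python
from collections import defaultdict


def nondeterministic_transitions(instrs, conds):
    """Return {(state, condition): [successors]} for ambiguous transitions.

    instrs[i] is the i-th instruction symbol in the trace and conds[i] is
    the condition leading from instrs[i] to instrs[i + 1].
    """
    targets = defaultdict(set)
    for lead, cond, trail in zip(instrs, conds, instrs[1:]):
        targets[(lead, cond)].add(trail)
    return {key: sorted(succ) for key, succ in targets.items() if len(succ) > 1}


# Hypothetical trace: s -x-> a -x-> b -x-> c -x-> b -x-> f -x-> b -y-> e
instrs = list("sabcbfbe")
conds = list("xxxxxxy")
conflicts = nondeterministic_transitions(instrs, conds)
```

In this example state 'b' transitions on 'x' to both 'c' and 'f', so at least one occurrence of 'b' would need a different prefix label.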
Figure 5. Non-deterministic Moore Machine

Figure 6. Deterministic Moore Machine

C. SYNTHESIS PROCEDURE

1. Function

The function of the synthesizer program is to
provide a minimum-state, correct program consistent with the
input trace of the example computation. The synthesis
process will be completed when it is determined which
occurrence of a labelled instruction corresponds to each
particular instruction in the input trace. In order to
accomplish this goal, the synthesizer is basically
structured as a depth-first search algorithm. Backup and
fixup mechanisms exist to enhance the search procedure when
pruning has not kept the algorithm from traversing a
fruitless branch of the search tree. The search mechanism
attempts to assign a label to each instruction in such a
manner that the generated algorithm remains technically
correct; that is, nondeterminism is not allowed to exist and
the original trace can still be executed. A number of
techniques exist within the synthesizer which aid pruning of
the search tree, and thereby make it possible to synthesize
more complicated programs in a reasonable amount of time
than could otherwise be expected from a general enumeration
technique. These techniques offset the major disadvantage of
exponential growth of the search space as a function of
input which is found in a general enumerative search
technique.
2. Concepts

Certain definitions and concepts must be presented
before the actual algorithm is discussed. In order to
facilitate the discussion, it is necessary to refer to
Figure 7. Each level in the figure consists of an
instruction-condition-instruction triple, referred to as an
I-C-I. In Figure 7 the leftmost symbol under I-C-I is
referred to as the leading instruction of the triple, the
middle symbol is the condition, and the rightmost symbol is
the trailing instruction. The trailing instruction at level
i becomes the leading instruction at level i+1. So this
input trace represents the instruction-condition sequence 's
r a n s r a ...'.

level   I-C-I

  1     sra
  2     ans
  3     sra
  4     aia
  5     axa
  6     aya
  7     axa
  8     anr

Figure 7. Instruction-Condition-Instruction Triple

Two levels i and j are said to belong to the same
couple-class if the elements of the level are the same.
Instruction elements of the trace which are in the same
couple-class may be assigned the same prefix label during
synthesis if the assignment does not cause non-determinism.
For example, given the trace in Figure 7, levels 1 and 3 are
in the same couple-class, as are levels 5 and 7. Difference
set relations are another situation of interest. They arise
when the first two elements of level i and level j are the
same, but the third element is not the same. A
difference set relation indicates that the leading
instructions cannot be represented by the same state
regardless of the prefix label assigned during synthesis,
because the leading instruction has the same transition to
two different trailing instructions. Again using the above
trace, level 2 and level 8 fall into this category. In this
situation, the index 8 would be entered into the difference
set for level 2. By implication, the index 2 is also in the
difference set for level 8, although, in practice, it is not
entered.
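
The two definitions above can be sketched directly. The fragment below is an illustrative reading, not the thesis implementation: triples are three-character strings, levels are 1-based, and the trace is Figure 7 as read from the figure (the final row is taken to be level 8). The function names are assumptions.

```python
from collections import defaultdict


def couple_classes(triples):
    """Group the 1-based levels whose I-C-I triples are identical."""
    groups = defaultdict(list)
    for level, triple in enumerate(triples, start=1):
        groups[triple].append(level)
    return {t: levels for t, levels in groups.items() if len(levels) > 1}


def initial_difference_sets(triples):
    """Map each level to the later levels that share its leading instruction
    and condition but have a different trailing instruction."""
    diff = defaultdict(list)
    for i, (lead_i, cond_i, trail_i) in enumerate(triples, start=1):
        for j in range(i + 1, len(triples) + 1):
            lead_j, cond_j, trail_j = triples[j - 1]
            if (lead_i, cond_i) == (lead_j, cond_j) and trail_i != trail_j:
                diff[i].append(j)   # entered only at the earlier level
    return dict(diff)


figure7 = ["sra", "ans", "sra", "aia", "axa", "aya", "axa", "anr"]
classes = couple_classes(figure7)
differences = initial_difference_sets(figure7)
```

As in the text, the index of the later level is recorded only in the difference set of the earlier level; the reverse entry is implied.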

Once the initial couple-class information and
difference set information have been determined, additional
difference set information can be obtained through the
chaining nature of differencing. For example, suppose the
trace consists of the one shown in Figure 8. Then the Moore
machine representation of this trace is shown in Figure 9.
index   trace

  5     axa
  6     axa
  7     ays

  8     aia
  9     axa
 10     ayt

Figure 8. Chaining of Difference Set Relations

Figure 9. Non-deterministic Input Trace

This machine is obviously nondeterministic since
state 'a' transitions by 'y' to two different states.
Difference set resolution requires that the index for 'ayt'
be in the difference set of 'ays'. Since that requirement
causes different states to represent the 'a' in 'ayt' and in
'ays', and further since the trailing 'a' in the preceding
level is exactly the same instruction, the preceding levels
now satisfy the difference set relation. The leading
instruction and the condition are the same, but the trailing
instruction in the I-C-I triple is different since they have
previously been assigned to a difference set relation.
Therefore, the lead instruction must be labelled with a
different prefix during assignment and similarly, the levels
above them. So the Moore machine will now be deterministic
and in the following form.




Figure 10. Deterministic Trace 
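
The chaining step can be sketched as a small worklist procedure. This is a minimal illustration under stated assumptions: difference relations are stored as pairs (i, j) of 1-based levels with i < j, the function name is invented, and the six-level trace below is hypothetical rather than the trace of Figure 8.

```python
def chain_differences(triples, pairs):
    """Propagate difference pairs to preceding levels.

    If the leading instructions of levels i and j must differ, then the
    trailing instructions of levels i-1 and j-1 differ as well; when those
    two levels share leading instruction and condition, they also form a
    difference pair.
    """
    result = set(pairs)
    work = list(pairs)
    while work:
        i, j = work.pop()
        if i == 1:
            continue                      # no preceding level to examine
        prev = (i - 1, j - 1)
        lead_i, cond_i, _ = triples[i - 2]
        lead_j, cond_j, _ = triples[j - 2]
        if (lead_i, cond_i) == (lead_j, cond_j) and prev not in result:
            result.add(prev)
            work.append(prev)
    return result


# Hypothetical trace; levels 3 and 6 differ initially ('ays' versus 'ayt').
trace = ["axa", "axa", "ays", "sza", "axa", "ayt"]
chained = chain_differences(trace, {(3, 6)})
```

The chain stops at levels 1 and 4 because their leading instructions are already different symbols.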

Given a partial trace derived from the example
execution, there are numerous Moore machines that can be
constructed to satisfy the trace. At one end of the
spectrum, a program can be constructed such that each
succeeding state is assigned a different prefix label. This
method always results in a straight-line program. Each
instruction has one transition entering it and one
transition exiting from it. Allowing this method produces
the maximum size program consistent with the input trace.
See Figure 11. This is not a particularly desirable method
since it does not recognize loop structures that can
significantly reduce the size of the program. Additionally,
it hides the basic structure of the algorithm. The major
advantage, of course, is that absolutely no search is
required to produce a deterministic machine.



condition   instruction

                a
    x           a
    x           a
    x           a

Figure 11a. Trace

Figure 11b. Program

Figure 11. Straight-line Program



On the other end of the spectrum, a program can be
constructed such that each identical instruction receives
the same prefix label. This method takes full advantage of
loop structures, and will result in a minimum state machine.
However, such a method will seldom produce a deterministic
machine; therefore, it will not produce a satisfactory
algorithm. See Figure 12.



level   cond   instr

  1             a
  2      x      a
  3      x      a
  4      x      a
  5      y      a
  6      y      b

Figure 12a. Trace

Figure 12. Minimum State Machine
The best solution lies somewhere between these
endpoints. A reasonable first guess at the number of states
required to produce a deterministic machine within this
spectrum can be made by establishing a lower bound on the
number of states. The cardinality of the instruction set is
defined as the number of different instructions appearing in
the trace. Using the above figure as an example, it can be
determined that the cardinality of the instruction set is
two; that is, there are two different instructions, 'a' and
'b', in the trace. This measure provides an absolute lower
bound on the number of states required in the final machine.
This lower bound can be refined by determining a lower bound
on the number of states needed for each individual
instruction. Once again, using the above figure as an
example illustrates this concept. The instruction 'a' at
level 5 must be different than the instructions at levels 1
through 4 because of difference set resolution, or else
nondeterminism results on the transition 'y'. Therefore, in
order to maintain determinism, the instruction 'a' must be
allowed at least two states. Summation of the lower bounds
for each of the instructions gives a lower bound on the
total number of states required for the machine. For this
particular example, the program would be generated as:

Figure 13. Instruction Set Lower Bounds

If the search space is viewed as a tree structure
then the levels of the tree can be associated with the
instructions by assigning the first instruction in the input
trace to the first level, the second instruction to the
second level, and so forth. The branching factor at each
level is the state lower bound computed for the instruction
seen at that level. The prefix label assigned to the
instruction is represented by the specific branch used to
traverse to the next level.

The idea of providing a lower bound on the number of
states leads to an iteratively expanding depth-first search.
When all possible combinations of prefix labels have been
tried, but the algorithm remains non-deterministic, the
lower bound is incremented and the search is restarted from
the top level. When the lower bound is increased, the search
tree obtains additional paths to the final solution by
increasing the branching factor associated with one or more
instructions. The depth of a successful search into the tree
is restricted by the lower bound on the number of nodes
required by the deterministic machine. Only when a pattern
of prefix assignments has been made which allows the
algorithm to remain deterministic and all of the
instructions in the original trace have been assigned prefix
labels will the synthesis terminate. This mechanism prevents
a straight-line model from being output as the algorithm
unless it is the only one that can satisfy the input trace.
More importantly, it provides the minimum-state
deterministic machine capable of executing the input trace.
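
The iteratively expanding depth-first search can be illustrated with a deliberately simplified sketch. It is not the synthesizer described in this thesis: it uses one global budget on the total number of states in place of per-instruction bounds, and it omits couple-classes, difference sets and the failure memory. The function name `synthesize` and its argument layout are assumptions.

```python
def synthesize(instrs, conds):
    """Return one prefix label per trace position for a minimum-state machine.

    instrs[i] is the instruction at position i; conds[i] is the condition
    leading from position i to position i + 1.
    """
    n = len(instrs)

    def search(pos, labels, trans, used, budget):
        if pos == n:
            return list(labels)
        instr = instrs[pos]
        count = used.get(instr, 0)            # labels already in use for instr
        for label in range(1, count + 2):     # reuse a label or open a new one
            is_new = label == count + 1
            if is_new and sum(used.values()) + 1 > budget:
                continue                      # would exceed the state budget
            state = (instr, label)
            key = None
            if pos > 0:
                key = ((instrs[pos - 1], labels[-1]), conds[pos - 1])
                if key in trans and trans[key] != state:
                    continue                  # would be non-deterministic
                if key in trans:
                    key = None                # transition already recorded
                else:
                    trans[key] = state
            if is_new:
                used[instr] = label
            labels.append(label)
            found = search(pos + 1, labels, trans, used, budget)
            if found is not None:
                return found
            labels.pop()                      # back up and undo the assignment
            if is_new:
                used[instr] = count
            if key is not None:
                del trans[key]
        return None

    # Start at the cardinality of the instruction set and expand the bound.
    for budget in range(len(set(instrs)), n + 1):
        found = search(0, [], {}, {}, budget)
        if found is not None:
            return found
    return None


# Trace of Figure 12a: a -x-> a -x-> a -x-> a -y-> a -y-> b
labels = synthesize(list("aaaaab"), list("xxxyy"))
```

For the trace of Figure 12a the search fails with two states and succeeds with three: the first four occurrences of 'a' share label 1 and the occurrence at level 5 receives label 2.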

D. SYNTHESIZER STRUCTURE 

The synthesis program is subdivided into two primary
modules: static processing of the input trace; and dynamic
processing of the information extracted from the input trace
by the preprocessing, or static processing, phase. Static
processing provides information such as couple-classes,
difference sets, and lower bounds on the number of machine
states. Dynamic processing uses knowledge inherited from
preprocessing to guide the search mechanism to a final
output of the algorithm. These two modules will be discussed
in turn, and the primary mechanisms involved will be
amplified.

1. Static Processing

Static processing can be conceptualized as
consisting of three main functions: (a) accept the input
trace; (b) preprocess the trace for difference sets,
couple-classes, and state bounds; and (c) prepare a trace
table for further use by dynamic processing. Once this
preprocessing has been accomplished, the static module is no
longer necessary to the synthesizer.

In the current configuration, the static module
expects to find the input as a sequence of
instruction-condition-instruction triples. Figure 14 is an
example of an input trace.

level   trace

  1     anp
  2     psa
  3     aga
  4     ayr
  5     rsr
  6     rsr
  7     rra
  8     aga
  9     ayt

Figure 14. Typical Input to Static Processor
Each line consists of a triple, for example 'anp'.
The 'a' represents an instruction, the 'n' represents the
condition which causes the program trace to transition to
the next instruction 'p'. For each level, the first element
represents the same instruction as the last element of the
preceding level. This is easier to see if the above trace is
represented as a Moore machine in which the nodes are
instructions and the conditions are transitions. State 'a'
transitions on condition 'n' to state 'p', which transitions
on condition 's' to state 'a', which transitions on condition
'g' back to state 'a', etc.
Figure 15. Moore Machine for Input Trace

Figure 16. Intermediate Trace Table

Each occurrence of an instruction symbol in the input trace
is represented by the same state at this point in the
synthesis.

Once the input trace has been accepted, static
processing can begin. Static processing consists of
determining the level indices associated with each
couple-class and with each difference set. For the trace of
Figure 14, these are shown in Figure 16.

There are two couple-classes in this trace. They are
[aga] at levels 3 and 8, and [rsr] at levels 5 and 6. The
remaining levels are not assigned to a couple-class because
no other levels match with them. Couple-class information is
useful to the dynamic processor for determining forced
assignments and dynamic non-equivalence. These ideas will be
discussed more fully in the section on dynamic processing.

Difference sets exist for levels 3 and 4. Level 4
has a difference set which contains the index 9; that is,
the element at level 4, 'ayr', must have a different prefix
label on 'a' than the element at level 9, 'ayt'. If the 'a'
is not labelled differently during the synthesis,
nondeterminism will result since the same transition would
lead to different nodes.

Difference set resolution is a very powerful
mechanism for ensuring deterministic behavior of the
algorithm. A considerable amount of the prefix label
assignments to the nodes can be resolved using difference
sets. Notice that level 8 appears in the difference set for
level 3 even though levels 3 and 8 are in the same
couple-class. At first this appears contradictory since
equivalent couple-class names imply that the elements are
the same, but difference set existence forces the lead
instructions to be different. This points out the relative
power of couple-class information and difference set
information. Difference set information is immutable.
Couple-class information only hints at equivalence. In this
particular example, the entry at level 3 was caused by the
chaining effect of difference set resolution. Notice that
the 'a' at level 4 must be different than the 'a' at
level 9, and that the trailing 'a' at level 3
is, by definition, the same as the leading 'a' at level 4.
The trailing 'a' at level 3 therefore cannot be the same as
the trailing 'a' at level 8, and so levels 3 and 8 cannot be
in the same couple-class.

To compute the lower bound on the number of states
in the algorithm, the minimum number of states needed for
each instruction is summed. For this same example, the
instruction set consists of {a,p,r,t}. The bounds for p, r,
and t are each 1. The bound for 'a' is 2. There must be at
least two different occurrences of 'a' from the difference
set resolution. Therefore, the minimum number of states with
which a deterministic Moore machine can be constructed for
this trace is 5.
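
The summation can be sketched as follows for the trace of Figure 14, taking the difference pairs (4, 9) and (3, 8) from the discussion above as given. The bound used here is a coarse assumption made for illustration: an instruction that leads any difference pair is allowed two states, every other instruction one. A trace in which three or more occurrences must be mutually distinct would need a tighter computation.

```python
def state_lower_bound(triples, difference_pairs):
    """Return per-instruction lower bounds and their sum.

    triples are I-C-I strings; difference_pairs holds 1-based level pairs
    whose leading instructions must be represented by different states.
    """
    instructions = {t[0] for t in triples} | {triples[-1][2]}
    bounds = {instr: 1 for instr in instructions}
    for i, _ in difference_pairs:
        bounds[triples[i - 1][0]] = 2   # leading instruction needs two states
    return bounds, sum(bounds.values())


figure14 = ["anp", "psa", "aga", "ayr", "rsr", "rsr", "rra", "aga", "ayt"]
bounds, total = state_lower_bound(figure14, {(4, 9), (3, 8)})
```

The result agrees with the count in the text: one state each for p, r and t, two for a, five in all.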
Finally, static processing passes all the
information concerning the input trace to the dynamic
processor via a trace table in the following form. Each
level has only one associated condition and one associated
instruction. Since difference set information is associated
with the lead instruction in an
instruction-condition-instruction sequence, it is entered at
that level. Since couple-class information is associated
with the entire instruction-condition-instruction sequence,

Figure 17. Trace Table



2. Dynamic Processing 

Dynamic processing involves assigning prefix labels
to the states of the machine. In this way, separate
occurrences of the same instruction are differentiated. The
dynamic processor is the search mechanism for the
synthesizer. It operates in such a way that, at any point in
the synthesis, the portion of the trace previously processed
represents a deterministic Moore machine. In order to
maintain the determinism, dynamic processing steps through
three phases: (1) assignment of the prefix label to the
instruction; (2) difference set resolution; and (3) dynamic
equivalence assurance. Additionally, each of these phases
has built-in fixup and backup conditions associated with
it. The fixup/backup conditions encountered during
difference set resolution or during dynamic equivalence
checking are indicators that, if the current assignments
remain the same, a nondeterminism will occur in future
assignments. As such, they inform the pruning mechanisms of
the search algorithm.

An integral part of the dynamic processor is the
failure memory. It controls the search. The failure memory
may be conceptualized as an L x M matrix where L is the row
size and corresponds to the number of levels in the trace.
Each row has M columns where M is equal to the lower bound
assigned to the instruction contained on that level of the
trace. An entry into the failure memory at some level i and
some column j, where 1 <= i <= L and 1 <= j <= M, prevents
the assignment of j as a prefix label for the instruction at
level i. When a failure memory cell contains an entry it is
called a valid cell; otherwise it is invalid. Each cell of
the failure memory is a two-element entry. The structure
factor is the first element. It indicates which level of the
trace caused the entry. The free state factor is the second
element. As the name indicates, this element is a function
of the number of free states available at the time of
assignment. The specifics of the failure memory operation
and the nature of failure memory entries will be discussed
throughout the rest of the section as each phase of the
dynamic processor is discussed.
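
One possible representation of the failure memory is sketched below. The class and method names are hypothetical. The free state factor is stored as the number of free states plus one, following the reading of the entry '1.1' given later in this section; `first_invalid` returns the lowest-numbered label that is not blocked on a row.

```python
class FailureMemory:
    """Rows correspond to trace levels, columns to candidate prefix labels."""

    def __init__(self, bounds):
        # bounds[i] is the number of columns (the lower bound M) for level i + 1.
        self.rows = [[None] * m for m in bounds]

    def add_entry(self, level, label, cause_level, free_states):
        # Cell = (structure factor, free state factor).
        self.rows[level - 1][label - 1] = (cause_level, free_states + 1)

    def is_valid(self, level, label):
        # A valid cell holds an entry and therefore blocks that label.
        return self.rows[level - 1][label - 1] is not None

    def first_invalid(self, level):
        # Lowest label still allowed at this level, or None if all are blocked.
        for label, cell in enumerate(self.rows[level - 1], start=1):
            if cell is None:
                return label
        return None


# Hypothetical five-level trace whose fifth row has four columns.
fm = FailureMemory([1, 1, 1, 1, 4])
fm.add_entry(level=5, label=1, cause_level=1, free_states=0)   # entry '1.1'
fm.add_entry(level=5, label=2, cause_level=4, free_states=0)   # entry '4.1'
next_label = fm.first_invalid(5)
```

Rows may have different lengths because each level's column count follows the bound of the instruction on that level.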
a. Label Assignment 

As previously mentioned, label assignment is the
first function provided by the dynamic processor. A label
assignment can be either forced or arbitrary. Additionally,
the assignment can result in the creation of a new state, a
label-name combination not seen before. A forced assignment
occurs when the instruction at the current working level is
a member of the same couple-class as an instruction at a
prior level, and the lead instruction into both of those
levels has the same label assignment. The current working
level is defined as the level of the trace which contains
the most recently assigned prefix label, but for which
difference set resolution and dynamic equivalence checking
have not been completed. An example is given in the trace
shown in Figure 18.

Figure 18. Partial Trace Labelling

The label at level 7 is forced by the label
assignments at levels 4 and 5. Notice that the instructions
at level 5 and at level 7 are in the same couple-class,
and that the instructions at levels 4 and 6 have the same
prefix label. This condition forces the instruction at level
7 to have the same prefix label as the instruction at level
5. The Moore machine representation of the partial trace is
shown in Figure 19. The assignment at level 8 is also forced
for similar reasons. By definition, any forced assignment
involves previously assigned states, that is,
label-instruction combinations that have been seen before;
therefore, no forced assignment can result in a new state.




Figure 19. Partially Determined Moore Machine 
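
The forced-assignment test can be sketched as below. The data layout is an assumption made for the example: positions are 0-based, `labels` holds the prefix labels assigned so far, and the trace shown is hypothetical rather than the one in Figure 18.

```python
def forced_label(instrs, conds, labels):
    """Return the forced prefix label for the next position, or None.

    instrs[i] is the instruction at position i, conds[i] the condition from
    position i to i + 1, and labels the prefix labels of positions
    0 .. len(labels) - 1. The next position is forced when an earlier
    position has the same I-C-I triple and its lead carries the same label.
    """
    k = len(labels)
    if k == 0:
        return None
    current = (instrs[k - 1], conds[k - 1], instrs[k])
    for j in range(1, k):
        earlier = (instrs[j - 1], conds[j - 1], instrs[j])
        if earlier == current and labels[j - 1] == labels[k - 1]:
            return labels[j]        # must reuse the earlier state
    return None


# Hypothetical trace: s -x-> a -y-> b -z-> a -y-> b
instrs = list("sabab")
conds = list("xyzy")
forced = forced_label(instrs, conds, [1, 1, 1, 1])
free = forced_label(instrs, conds, [1, 1, 1, 2])
```

In the first call both occurrences of 'a' carry label 1, so the final 'b' must reuse the label of the earlier 'b'; in the second call the leads differ and the assignment is arbitrary.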
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Tne failure memory can De used In conjunction 
with forced assignments to signal a Dactup condition to the 
search. If tne failure memory entry corresponding to tne 
label assignment at tne current wortcine level is valid, then 
a contradiction results from the forced assignment. Suppose 
that the trace table and failure memory are as shown in 
Figure 20, and tne forced assignment at level £ nas just 
been made. Tne entry '1.1' at row 2, column 8 of tne failure 
memory is interpreted in the following manner. The integer 
to tne left of tne decimal indicates that the entry was 
caused by the current assignment at level 1. The 'l' to the 
right of the decimal point is the number of free states + 1 
available when tne assignment at level 1 caused tne failure 
memory entry? therefore, when tne entry was made there were 
no free states available. A. free state is one wnicn nas not 
been bound to a particular instruction. 
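
The encoding of a failure memory entry described above can
be illustrated with a short sketch. The names FailureEntry
and parse_entry are hypothetical and are not taken from the
original system; the entry is assumed to be stored as a
string such as '1.1'.

```python
from typing import NamedTuple


class FailureEntry(NamedTuple):
    cause_level: int  # level whose assignment caused the entry
    free_states: int  # free states available when the entry was made


def parse_entry(text: str) -> FailureEntry:
    """Decode an entry written as '<level>.<free states + 1>'."""
    level_part, free_part = text.split(".")
    return FailureEntry(int(level_part), int(free_part) - 1)


entry = parse_entry("1.1")
print(entry.cause_level, entry.free_states)  # 1 0
```

Under these assumptions the entry '1.1' decodes to level 1
with no free states, which agrees with the reading given in
the text.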

The assignment at level 8 is forced. In other words, the
sequence of the previous assignments causes the prefix label
of the instruction at level 8 to be a 2. However, the
failure memory contains an entry at row 8, column 2,
FM(8,2). This entry indicates that the instruction at level
8 cannot be assigned the label '2', for if it were assigned
a '2', a nondeterminism would result. To resolve the
conflict, backup is initiated until the last unforced
assignment is found. In this case, the backup is to level 6.
The assignment at level 6 will be changed and the search
will continue from there.



Trace Table (level, cond, instr, c-c, label)
Failure Memory (columns 1, 2, 3)

Figure 20. Trace Table/Failure Memory Configuration
for a Forced Assignment

If the assignment is not forced, the failure memory row
corresponding to the current working level is searched for
the first occurrence of an invalid cell. An invalid cell is
one which does not contain a failure memory entry. If a cell
is invalid, the assignment of a prefix label corresponding
to the failure memory column index for that cell is possible
on that level of the trace. The column number of the first
invalid cell becomes the label assignment for the
instruction at that level. For example, suppose level 6 is
the current working level and the trace table and failure
memory have the configuration shown in Figure 21.

Trace Table           Failure Memory

level  cond  instr    1    2    3    4

6      r     a        1.1  4.1

Figure 21. Trace Table Entry Showing
Arbitrary Assignment Method

The first invalid entry in the failure memory on row 6 is in
column 3; therefore, instruction 'a' for level 6 will be
assigned a prefix label of 3. These non-forced assignments
may result in the creation of a new state; that is, a
label-instruction pair not previously assigned during the
synthesis. If, at some future point in the search, a backup
is initiated that reaches this level of the trace, the
backup mechanism will not stop to perform a retry. At any
point in the synthesis, all previous levels have received
assignments based on the constraint that the minimum number
of states has been used consistent with maintaining
determinism; therefore, assigning a different prefix label
to a state which has been defined as a new state only
changes the name of the state, and does not change the
structure of the algorithm. Since the structure of the
algorithm has not been changed, the cause of the
nondeterminism is still present.
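
The arbitrary assignment rule can be sketched as a scan of
one failure memory row. This is a minimal illustration
only: the row is assumed to be a dictionary from column
number to entry text, and the function name
first_invalid_column is hypothetical.

```python
from typing import Dict, Optional


def first_invalid_column(row: Dict[int, str], num_labels: int) -> Optional[int]:
    """Return the lowest column that holds no entry, or None if all are valid."""
    for label in range(1, num_labels + 1):
        if label not in row:  # an invalid cell holds no failure entry
            return label
    return None  # every cell is valid


# Row for the working level of Figure 21: columns 1 and 2 hold entries.
row = {1: "1.1", 2: "4.1"}
print(first_invalid_column(row, 4))  # 3
```

A result of None corresponds to a row in which no label can
be assigned without adding a state.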

One other type of assignment should be mentioned at this
point. Pseudo-assignment occurs when there is only one
invalid cell left in a failure memory row at a level other
than the current working level and there are no free states
available. Although pseudo-assignment does not immediately
cause a label to be assigned to the instruction at that
level, it does simulate a look-ahead mechanism for the
search technique by triggering difference set resolution and
dynamic equivalence checking as if that level of the trace
were assigned a value. Since the pseudo value is the only
value currently possible for that level, if a backup or
fixup condition is encountered during pseudo-assignment, the
assignment mechanism can immediately try another label at
the current working level, thereby saving the unnecessary
search of a path which it already knows to be nonproductive.

Once a tentative label assignment has been made to the
instruction at the current working level, difference set
resolution and dynamic equivalence checking can be
performed. Although these actions may cause a fixup on the
prefix label at the current working level, their primary
purpose is to furnish information to the failure memory that
will help guide future label assignments.

b. Difference Set Resolution

Difference set resolution prevents future assignments from
being made that are known to cause nondeterminism if the
current assignments remain unchanged. Difference sets
outline a significant portion of the structure of the input
trace without regard to label assignments, in that they
prevent nondeterminism from occurring as a result of the
same transition out of a state leading to more than one
following state. Consider Figure 22.




Figure 22. Nondeterministic Input Trace

There are several instances where difference set resolution
will force a state to be split into two or more different
states. States 'a', 'g', 'p', and 't' all have
nondeterministic transitions associated with them. The trace
table and failure memory configuration for this trace is
shown in Figure 23.

element, and the column corresponding to the prefix label
assigned to the instruction at the level from which the
difference set is being resolved, if the cell has not
already been made valid through a previous assignment. For
example, if the prefix assignment at level 1 is a '1', the
failure memory entries are made in column 1 at levels 3, 5,
15, and 18. Similarly, when the assignment '1' is made at
level 2, failure entries are made at levels 4 and 11. Now
when the assignment at level 3 is made, the dynamic
processor will not try to assign a prefix value of '1' since
the failure memory cell at (3,1) is valid. The assignment
will automatically be '2'. Notice that at level 5 the
previous assignments have caused the prefix label to be a
'3'. In other words, the failure memory has caused the
search tree to be pruned so that an assignment of '1' or '2'
will not be tried. Either one of these assignments would
have resulted in nondeterminism being introduced into the
trace at level 6.

Figure 24b. Prefix Label Equals 2
Figure 24. Nondeterministic Prefix Label Assignments
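
The marking of failure memory cells during difference set
resolution might be sketched as follows. The difference
sets used here are only those quoted in the example above
(levels 3, 5, 15, 18 for level 1 and levels 4, 11 for level
2); the dictionary layout and the name
resolve_difference_set are assumptions made for
illustration, and zero free states is assumed.

```python
from typing import Dict, List, Tuple

# (level, label) -> (causing level, free states + 1)
FailureMemory = Dict[Tuple[int, int], Tuple[int, int]]


def resolve_difference_set(fm: FailureMemory,
                           diff_sets: Dict[int, List[int]],
                           level: int, label: int,
                           free_states: int) -> None:
    """Forbid `label` at every level in the difference set of `level`."""
    for other in diff_sets.get(level, []):
        cell = (other, label)
        if cell not in fm:  # keep an entry made by an earlier assignment
            fm[cell] = (level, free_states + 1)


diff_sets = {1: [3, 5, 15, 18], 2: [4, 11]}
fm: FailureMemory = {}
resolve_difference_set(fm, diff_sets, level=1, label=1, free_states=0)
resolve_difference_set(fm, diff_sets, level=2, label=1, free_states=0)
print((3, 1) in fm)  # True
```

With cell (3,1) valid, a first-invalid-column search at
level 3 would skip label 1, as the text describes.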

While failure memory entries are being made under difference
set resolution, it is possible for a row to have all cells
valid except one. This has been previously defined as a
situation leading to pseudo-assignment. This situation has
occurred at level 11 in the example given in Figure 23. When
such an occurrence happens, a look-ahead mechanism is
triggered to resolve the difference set at that level. In
this example, the failure memory cell at (21,3) has been
validated with an entry which indicates the current working
level as level 4 when the pseudo-assignment occurred at
level 11. Another situation which can occur in a failure
memory row is when all the entries in the row become valid.
This condition is called a fence.

Since the search mechanism always knows the level from which
it is doing look-ahead by difference set resolution, it is
able to perform a fixup on the label assignment at the
earliest possible time. A fixup is accomplished by
incrementing the prefix label by one. If an entire row in
the failure memory becomes valid and there are no free
states available, a fixup must be performed on the label
assignment at the current working level. If the label is
left the same, then when the search reaches the fenced
level, no assignment will be possible. Each time a fixup
occurs, all entries made in the failure memory as a result
of the previous label assignment are deleted, and entries
are then made based on the new label.

c. Dynamic Equivalence

Couple-class information furnished by static processing aids
in the determination of dynamic nonequivalence. Dynamic
nonequivalence can occur during the synthesis at any level
below the current working level when the couple-classes are
equal. Dynamic equivalence results when instructions in the
same couple-class have been assigned the same prefix label.
Consider Figure 25. The I-C-I triples at levels 5 and 6 and
at levels 11 and 12 are identical; therefore, they are in
the same couple-class. The instruction 'a' at level 5 has
been assigned a prefix of '2', and the instruction 'a' at
level 6 has been assigned a prefix of '1'. Now, if the
instruction at level 11 is assigned a prefix of '2' and the
instruction at level 12 is assigned a prefix of '1', dynamic
equivalence will occur. Further, the assignment at level 12
will be forced. Dynamic nonequivalence results when such an
assignment scheme causes nondeterminism. Dynamic equivalence
checking functions as a look-ahead mechanism by preventing
the future occurrence of a forced assignment which will
result in nondeterminism. Suppose the synthesizer is
inspecting the trace in Figure 6, and has just assigned the
instruction at level 6 a prefix of '1'.

Notice that level 12 is in the same couple-class as level 6.
Since the instruction at each of these levels is in the same
couple-class, the possibility exists that they may be the
same instruction. If the instruction at level 11 is assigned
a label of '2' when the working level reaches that part of
the trace, then the assignment at level 12 will be a forced
assignment of '1'. However, an entry has already been made
in the failure memory at (12,1) which indicates that the
instruction at level 12 cannot be assigned a prefix label of
1.

Figure 25. Trace Table/Failure Memory

In order to avoid this contradiction and a backup, dynamic
nonequivalence processing causes an entry at (11,2) of the
failure memory, which corresponds to the labelling of '2'
given to the instruction at level 5. Once this is
accomplished, when the working level descends to level 11,
an assignment of '2' cannot be made. As a result, the
assignment at level 12 will no longer be forced by dynamic
equivalence, which gives the synthesizer a chance to try
other assignments that will maintain determinism of the
algorithm.
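
The look-ahead just described can be sketched in simplified
form. The sketch assumes that a later pair of levels
(q, q+1) matches an already labelled pair (p, p+1) when
both levels fall in the same couple-classes, and that the
failure memory maps a (level, label) cell to the level that
caused it. The function name and data layout are
hypothetical and not taken from the original synthesizer.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]  # (level, label)


def add_nonequivalence_entries(fm: Dict[Cell, int],
                               couple_class: Dict[int, int],
                               labels: Dict[int, int],
                               working_level: int) -> None:
    """Block labels that would force an already forbidden assignment."""
    last = max(couple_class)
    for p in sorted(labels):
        if p + 1 not in labels:
            continue
        for q in range(working_level + 1, last):
            same_pair = (couple_class[q] == couple_class[p]
                         and couple_class[q + 1] == couple_class[p + 1])
            # Labelling q like p would force q + 1 to the label of p + 1.
            if same_pair and (q + 1, labels[p + 1]) in fm:
                fm.setdefault((q, labels[p]), working_level)


# Levels 5/11 and 6/12 share couple-classes; all other levels are distinct.
couple_class = {level: level for level in range(1, 13)}
couple_class[11] = couple_class[5]
couple_class[12] = couple_class[6]
labels = {5: 2, 6: 1}
fm = {(12, 1): 3}  # label 1 is already forbidden at level 12
add_nonequivalence_entries(fm, couple_class, labels, working_level=6)
print((11, 2) in fm)  # True
```

In this configuration the sketch adds the entry at (11,2),
matching the example in the text.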

Pseudo-assignment conditions and fixup conditions can occur
in the failure memory as a result of validation of all but
one of the failure memory cells in a row, in the same manner
that they occur in difference set resolution. Additionally,
dynamic equivalence and difference set resolution can
interact to cause failure memory entries in the following
manner. If a failure memory entry is made by difference set
resolution at any level which is in the same couple-class as
a level previously assigned a prefix label, and if the
failure memory entry prevents the assignment that will cause
the instructions to become part of the same state, then
dynamic nonequivalence will result; therefore, an entry must
be made in the failure memory to indicate this condition.

3. Backup/Fixup

The discussion of backup and fixup conditions has been saved
until last. The basic idea behind constructing the
synthesizer is to provide as much information as possible to
the search mechanism, and thereby direct the label
assignment with a minimal number of retries. With this in
mind, backup and fixup become last resorts.

The fixup operation attempts to resolve nondeterminism by
incrementing the label at the current working level when a
contradiction occurs. If the newly incremented label is not
a legal assignment or does not correct the contradiction,
then backup must be initiated. The fixup operation cannot be
attempted if the assignment at the current working level is
forced or if the assignment created a new state. In either
of these cases, a fixup operation would leave nondeterminism
in the algorithm.

If a fixup fails, or cannot be attempted, backup is
initiated. Backup must be initiated from the current working
level when any level is discovered which contains one of
these conditions:

1) The label assignment is forced and the failure memory
cell corresponding to that level and label is valid,

2) The label assignment causes a contradiction and
represents a new state, or

3) There is no free state available for the instruction
at a particular level, and all entries in the failure
memory row at that level are valid.

The backup begins at the current working level regardless of
which level triggered the mechanism, and continues until
none of the three conditions given above are present. At
that level a fixup operation is attempted and the search
begins anew. Any entries into the failure memory which were
caused by levels greater than or equal to the new current
working level are invalidated by resetting the failure
memory entries to (0,0). Additionally, any assignments are
deleted along with their side-effects, such as annotations
on forced assignments and new states. If backup causes the
working level to be decremented to zero, a free state is
added for the use of the first instruction needing more
states than initially allotted as the lower bound.
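
A much reduced sketch of the backup step is given here. It
collapses the three conditions into two flags per
assignment (forced, new state), which is a simplification
and not the full test described above; the Assignment
record, the back_up function, and the representation of a
failure memory entry by its causing level are all
assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Assignment:
    label: int
    forced: bool = False
    new_state: bool = False


def back_up(trace: Dict[int, Assignment],
            fm: Dict[Tuple[int, int], int],
            working_level: int) -> int:
    """Retreat to the last level where a fixup may be attempted."""
    level = working_level
    while level > 0 and (trace[level].forced or trace[level].new_state):
        del trace[level]  # discard the assignment and its annotations
        level -= 1
    # Invalidate entries caused by levels at or beyond the new working level.
    for cell in [c for c, cause in fm.items() if cause >= level]:
        del fm[cell]
    return level  # 0 means a free state must be added by the caller


trace = {n: Assignment(label=1) for n in range(1, 9)}
trace[7].forced = trace[8].forced = True
fm = {(8, 2): 1, (9, 1): 6, (10, 3): 7}
print(back_up(trace, fm, working_level=8))  # 6
```

With forced assignments at levels 7 and 8, the sketch backs
up to level 6, as in the forced assignment example given
earlier in this section.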
III. PREPROCESSOR



A. PROBLEM SPECIFICATION 

The program synthesizer expects a set of triples where each
triple is an instruction, a condition, and an instruction.
Biermann [2] has shown that conditions inadvertently or
purposely omitted by the user may be inserted into a trace.
The algorithm for insertion of conditions collects the set
of atoms seen on the transitions for an instruction. An atom
is an entity which has a value of either 'true' or 'false'.
A condition is composed by logical conjunction and
disjunction operations on atoms. For example, an atom may be
'c <= 0', but a condition may be 'c <= 0 and a = 4'. A set
of minterms is computed from the set of atoms and one of the
minterms is inserted after each occurrence of that
instruction in the trace. If {a,b} is a set of atoms, then
the set of minterms will be
{{a,b}, {-a,b}, {a,-b}, {-a,-b}} where - stands for logical
negation. It has been shown in reference [16] that only one
of the minterms can be true for each occurrence of a
transition from any single instruction.
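
The construction of minterms from a set of atoms can be
sketched briefly. The function name minterms and the use of
a leading '-' on a string to denote negation are choices
made for this illustration only.

```python
from itertools import product
from typing import List, Tuple


def minterms(atoms: List[str]) -> List[Tuple[str, ...]]:
    """Return every combination of plain and negated atoms."""
    result = []
    for signs in product([True, False], repeat=len(atoms)):
        term = tuple(atom if keep else "-" + atom
                     for atom, keep in zip(atoms, signs))
        result.append(term)
    return result


for term in minterms(["a", "b"]):
    print(term)
```

For n atoms the sketch yields 2^n minterms, and any truth
assignment to the atoms satisfies exactly one of them.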

One problem with the algorithm is that it is incapable of
inserting conditions if the user has failed to supply any
atoms after a particular instruction. For example, if the
user should specify instruction I1 followed by instruction
I2 in one part of the trace and instruction I1 followed by
I3 in another part of the trace, but the user fails to
provide a condition after either occurrence of I1, then the
algorithm will be unable to generate a condition for I1. It
is assumed that I1 does not appear with an atom elsewhere in
the trace. The synthesizer will force two states for I1 to
resolve any nondeterminism. This mechanism is fully
explained in Section II. If conditions had been supplied in
the above example, the difference in the two programs would
be the number of states assigned to instruction I1. Figure
26 shows a partial computation without explicitly expressed
conditions along with the associated synthesized program
fragment. Figure 26 assumes that I1 does not appear
elsewhere in the trace. Figure 27 is a representation of the
same partial computation except that the conditions c1 and
c2 have been explicitly expressed. The computations in both
figures are the same, and each program fragment will
correctly execute either trace; therefore, the programs must
be equivalent with respect to program behavior. However, the
program in Figure 27 is minimal in that it contains fewer
states because the user explicitly supplied the conditions.

(S,...,I1,I2,...,I1,I3,...,H)
Example Computation

Synthesized Program

Figure 26. Computation without Explicit Conditions

(S,...,I1,c1,I2,...,I1,c2,I3,...,H)
Example Computation

Synthesized Program

Figure 27. Computation with Explicit Conditions



We intend to show that there are mechanisms which can be
used to automatically generate the necessary conditions for
the correct synthesis of an algorithm produced by an example
computation without the user explicitly defining them. The
problem may be described as follows. Given an example
computation without explicitly defined conditions, infer
those conditions necessary to control the flow of
computation in a manner such that the synthesized program
will demonstrate the behavior desired by the user. In order
to facilitate the solution to the problem, a condition will
be viewed as a function that returns a value of 'true' or
'false' when called, rather than a logical operation on
atomic boolean entities. The problem can then be thought of
as constructing a function.

Very little information is available to the current version
of the synthesizer when the user provides only a sequence of
instructions. It is certainly not enough to generate minimal
programs as described in Figure 27. This led us to search
for other sources of information that would allow us to
construct the necessary conditions. We soon realized that
the instructions issued by the user do not exist in a
vacuum. These instructions manipulate data. If the entire
computer memory, including registers, is viewed as the
domain of interest, then execution of an instruction always
changes this state. Intuitively, the domain also reflects
the reason that the user decided to execute a particular
instruction. A search of a space of this size in order to
determine the reason is impractical; however, observing only
those data elements affected by the sequence of instructions
can often be quite practical and can significantly reduce
the search space.

We chose the text editing domain as the domain of interest
since we felt that it would be sufficiently interesting to
warrant application of synthesis techniques. This domain was
selected because, first, techniques developed in this domain
may be general enough for extension into other domains;
secondly, the world for this domain can be described as the
set of all characters contained in a particular text file,
which makes the world finite; and finally, the instruction
set is small enough to be manageable.

Although our primary research is directed toward studying
techniques to apply to automatic condition generation, we
feel that the synthesizer could be a powerful text editor
and could provide some useful features not normally seen in
conventional text editors. Extended features could include
the ability to capitalize the first letter of every
sentence, the ability to capitalize all small letters in the
text, the ability to identify a string and perform some
operation before, after or on it, or any combination of
these editing actions.

The working hypothesis is to have the user process the text
file in a normal manner and have the synthesizer infer a
program from his actions. Two requirements were levied upon
the user. The first requirement is that he must inform the
synthesizer when he desires to have a program generated so
that the synthesizer can begin monitoring the user's
actions. A great deal of time was spent trying to figure out
methods that allowed one general mechanism to be used to
monitor the user's actions and the resulting changes in the
text file. Since we could not produce such a mechanism, a
second requirement was levied on the user. This requirement
recognizes a basic distinction between two different aspects
of text editing: context free substitutions and context
sensitive substitutions. We define a context free
environment to be one in which the character to be operated
upon is not dependent on characters around it. Capitalizing
all occurrences of small letters is an example of a context
free operation. A context sensitive operation is defined as
an operation in which the action to be performed on a
character or sequence of characters depends upon other
characters around the main character of interest.
Capitalizing the first letter of every sentence is a context
sensitive operation. Condition inference in a context
sensitive environment is inherently more difficult than in a
context free environment in that the condition must be
constructed from events which require a look-ahead
capability not inherent in the synthesizer. The user will be
free to switch from environment to environment at his
convenience. The synthesizer will create program segments
from each environment which can be used to construct a
complete program by a post-processor.

B. DESIGN FOR A CONTEXT FREE ENVIRONMENT 
1. Overview

Programs that operate on a single entity can be constructed
by the synthesizer. Figure 28 shows the construction of a
program from a trace intended to communicate that the letter
"d" should be capitalized wherever it appears in the text
file. The column labelled 'trace' contains triples of the
form instruction, condition, instruction. B is the start
instruction, R is the move right instruction, C is the
capitalize or change instruction and S is the stop
instruction. The conditions for this trace are the
characters seen in the text file prior to the execution of
the second instruction in each triple. The special condition
"0" is the null condition, and is always inserted after the
start instruction.

The generated program will correctly execute the trace that
was used to construct it, and by examination of the program
it can be shown that the program will convert all d's to D's
in a text file consisting of the characters A, b, C, d, F
and G. There are no arcs available for other characters in
the character set. In order to generate a program to perform
the same function on an arbitrary text file, the user would
be forced to give an example of the desired transition for
every character in the character set.

Since it is desirable to relieve the user of the chore of
providing an inordinate number of examples in order to
completely specify the function, a method is required that
utilizes a few examples of the types of conditions that are
to appear on the arcs to generalize the conditions into a
more compact and complete form. If a generalization can be
found, the multiple arcs may be replaced with a more general
condition and, therefore, correct programs can be created
with fewer examples. However, the combination of arcs
between nodes must be accomplished so that determinism is
maintained, or the synthesizer will not create a minimum
state machine capable of performing the desired function.
That means that the generalization technique must be able to
handle conflicts properly. The arcs in Figure 28 that
originate at state R and terminate at state R appear to
consist of elements from the capital letters and small
letters. The generalization
{x | x ∈ capital letters} ∪ {z | z ∈ small letters} would
appear to be a reasonable replacement for all of the R to R
arcs. If this generalization were made, a conflict would
result because the letter 'd' is also an element of
{z | z ∈ small letters}.

Trace Synthesized program 



B 0 R 
R A R 
R b R 
R C R 
R d C 
C D R 
R F R 
R G R 
R 0 S 




Figure 28. Synthesizer Action 
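
The conflict caused by over-generalizing the R to R arcs can
be checked mechanically. The sketch below encodes the trace
of Figure 28 as triples, with None standing for the null
condition; the helper names arcs_from and conflicts are
hypothetical.

```python
import string
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, Optional[str], str]  # (instruction, condition, instruction)

# Trace of Figure 28; None stands for the null condition "0".
TRACE: List[Triple] = [
    ("B", None, "R"), ("R", "A", "R"), ("R", "b", "R"),
    ("R", "C", "R"), ("R", "d", "C"), ("C", "D", "R"),
    ("R", "F", "R"), ("R", "G", "R"), ("R", None, "S"),
]


def arcs_from(trace: List[Triple], source: str) -> Dict[str, Set[str]]:
    """Collect the characters seen on each arc leaving `source`."""
    arcs: Dict[str, Set[str]] = {}
    for first, cond, second in trace:
        if first == source and cond is not None:
            arcs.setdefault(second, set()).add(cond)
    return arcs


def conflicts(arcs: Dict[str, Set[str]], target: str,
              generalized: Set[str]) -> Set[str]:
    """Characters of the generalized set already used on other arcs."""
    clash: Set[str] = set()
    for other, chars in arcs.items():
        if other != target:
            clash |= chars & generalized
    return clash


letters = set(string.ascii_uppercase) | set(string.ascii_lowercase)
print(conflicts(arcs_from(TRACE, "R"), "R", letters))  # {'d'}
```

The only clash is the character 'd', which already labels
the arc from R to C.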



2. Structure of the Condition Preprocessor

The preprocessor is designed to accumulate knowledge from
the traces it is provided, then use the knowledge to
construct meaningful conditions. The preprocessor scans the
input trace looking at the instructions and characters that
are seen before the instructions. This phase extracts pairs
of instructions from the trace. The trace in Figure 28 would
have the instruction pairs (B,R), (R,R), (R,C), and (C,R)
extracted. Attached to each of these pairs is the set of
characters that were seen between the pair. The preprocessor
then analyzes the information to determine if a
generalization can be made from the set of characters
associated with each instruction pair. The natural division
mentioned above allows the preprocessor to be divided into
two modules. The first module performs the scanning function
while the second module analyzes the information and applies
a heuristic to provide the most general condition possible.
The implementation of the preprocessor will be discussed
later, but before it can be discussed an explanation of the
data structures required by the preprocessor is needed.



3. Preprocessor Data Structures

To simplify the problem we define two types of instructions
in this domain. Instructions that specify the current
location of interest are cursor positioning instructions.
Instructions that change the state of the domain are data
manipulation instructions. The preprocessor accepts as input
a sequence of instructions and an associated sequence of
characters. The first instruction in the instruction
sequence is always the start instruction, which does not
have a character associated with it. The last instruction in
the sequence is always a halt instruction.

Every action performed by the user is captured and appended
to the instruction sequence list. The character sequence is
created in harmony with the instruction sequence. In the
quiescent state the cursor will indicate a certain position
in the text. When the user performs some action such as
moving the cursor right, a monitor picks up the value in the
old position and associates that value with the instruction
executed by the user. For example, assume a user has a text
file in lower case letters that he wants to change to all
upper case letters. The user initiates the synthesizer, then
proceeds across the line of text changing lower case letters
to upper case letters. For the purpose of this example,
assume the line of text is "change lower case to upper
case". As the user moves across the line making
substitutions, the condition monitor captures the actions
performed and the characters seen. The example line would
yield an instruction sequence of (B, C, R, C, R, C, R, C,
..., C, S). The associated character sequence would be (c,
C, h, H, a, A, ..., e, 0). The "C" and "R" in the
instruction sequence are the capitalize and move right
instructions, respectively. Note that the capitalize
instruction does not reposition the cursor, and when the
user moves the cursor to the right, the result of the
capitalize instruction is associated with the move.
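
The pairing of instructions with observed characters can be
illustrated by simulating the monitor for the capitalization
example. This is a sketch under stated assumptions: the
function monitor_capitalize is hypothetical, the simulated
user always moves right after each character, and None is
used for the null condition; the end-of-line behavior may
differ from the sequence quoted above.

```python
from typing import List, Optional, Tuple


def monitor_capitalize(line: str) -> Tuple[List[str], List[Optional[str]]]:
    """Simulate a user capitalizing every lower case letter in `line`."""
    instructions = ["B"]                  # start instruction, no character
    characters: List[Optional[str]] = []
    for ch in line:
        if ch.islower():
            instructions.append("C")      # capitalize; cursor stays put
            characters.append(ch)         # character seen before the change
            ch = ch.upper()
        instructions.append("R")          # move right
        characters.append(ch)             # result of the change, if any
    instructions.append("S")              # stop instruction
    characters.append(None)               # null condition
    return instructions, characters


instrs, chars = monitor_capitalize("ch")
print(instrs)  # ['B', 'C', 'R', 'C', 'R', 'S']
print(chars)   # ['c', 'C', 'h', 'H', None]
```

The character list is one element shorter than the
instruction list because the start instruction has no
associated character.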
Another data structure needed by the preprocessor is the
ASCII vector. The ASCII vector is a 128-byte linear array
with indices numbered 0 through 127. Each byte in the array
is referenced by the decimal value of a particular ASCII
character. For example, the array element reserved for the
ASCII character '0' is indexed by 48 decimal. The array
element reserved for the ASCII character 'B' is indexed by
66 decimal. The vector defines a partition of the ASCII
character set by using the following technique. The ASCII
character set has been divided into eight mutually exclusive
subsets.

Subset 0   Capital letters
Subset 1   Small letters
Subset 2   Numbers
Subset 3   Space character <sp>
Subset 4   Symbols
Subset 5   Punctuation
Subset 6   Arithmetic operators
Subset 7   Control characters

The subset name is entered into the ASCII vector at each
cell by converting the ASCII character to its decimal
equivalent and using that value as the array index. The
default partition is shown in Figure 29.

Index   48  49  ...  57   65  66  ...  90

ASCII    0   1  ...   9    A   B  ...   Z

Figure 29. ASCII Vector
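
A sketch of how such a vector might be initialized is shown
here. The source does not list which characters belong to
the symbol, punctuation and arithmetic operator subsets, so
the memberships chosen below are assumptions, as is the
function name build_ascii_vector.

```python
import string
from typing import List

(CAPITAL, SMALL, NUMBER, SPACE,
 SYMBOL, PUNCTUATION, ARITHMETIC, CONTROL) = range(8)

ARITHMETIC_CHARS = "+-*/=<>"      # assumed membership
PUNCTUATION_CHARS = ".,;:!?'\""   # assumed membership


def build_ascii_vector() -> List[int]:
    """Return a 128-element list mapping ASCII code to subset number."""
    vector = [SYMBOL] * 128       # anything not matched below is a symbol
    for code in range(128):
        ch = chr(code)
        if ch in string.ascii_uppercase:
            vector[code] = CAPITAL
        elif ch in string.ascii_lowercase:
            vector[code] = SMALL
        elif ch in string.digits:
            vector[code] = NUMBER
        elif ch == " ":
            vector[code] = SPACE
        elif ch in ARITHMETIC_CHARS:
            vector[code] = ARITHMETIC
        elif ch in PUNCTUATION_CHARS:
            vector[code] = PUNCTUATION
        elif code < 32 or code == 127:
            vector[code] = CONTROL
    return vector


vector = build_ascii_vector()
print(vector[ord("0")], vector[ord("B")], vector[ord("a")])  # 2 0 1
```

Looking up a character's subset is then a single index
operation on its decimal code.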

The character set hierarchy is defined by the tree structure
in Figure 30. The tree is related to the ASCII vector
through the character subset names contained on each node
one level above the leaf nodes. For the default hierarchy
shown in Figure 30, a zero would be entered in the ASCII
vector for all capital letters, and a 1 would be entered for
all small letters. If a different partition of the character
set is required, the user can modify the hierarchy or create
his own. An example will be given to explain how the
modification may be accomplished. Assume a partition is
desired where the vowels are isolated into a set. Assume
further that the vowels are to be subdivided into capital
vowels and small vowels. The hierarchy would be modified by
placing a son called 'vowels' on the alphabetic node. Attach
to the new node two sons, called 'Cap-vowels' and
'Small-vowels', with arcs to the appropriate characters.
Relabel the hierarchy so that sibling relations are numbered
in increasing order. Finally, initialize the ASCII vector
with the new labelling. All of the modifications can be done
by the system when the user calls for the modification. The
modified hierarchy is shown in Figure 31.



Tree rooted at ASCII, with the characters as leaves
Figure 30. Default Hierarchy

Tree rooted at ASCII, with Cap-vowels and Small-vowels nodes added
Figure 31. Modified Hierarchy



The next data structure used by the preprocessor is the
transition table. The transition table contains the
knowledge gleaned from scanning the instruction sequence and
the character sequence created by the monitor. Figure 32
shows the format of the transition table. The transition
table is an array of records, with each record containing
information on a transition. In the table, I1 and I2 are
instructions where I2 directly follows I1 in at least one
place in the instruction sequence. 'Active-sets' is a field
that contains information on sets of characters that have
been observed by the monitor on the transition from I1 to
I2. The fields 'Set-1' through 'Set-n' contain the value for
the set name, the count of the elements from the set
associated with the transition, and a pointer to a linked
list of the elements. The records that would be created for
the trace given in Figure 28 would be associated with the
transitions B to R, R to R, R to C, C to R and R to S.



| I1 | I2 | Active-sets | Set-1 | Set-2 | ... | Set-n |
|    |    |             |       |       |     |       |
|    |    |             |       |       |     |       |

Figure 32. Format of the Transition Table
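
A possible in-memory form of the transition table is
sketched below for the trace of Figure 28. Nested
dictionaries stand in for the record fields and linked
lists, the element count is recovered as the length of each
list, and the subset names and the function names classify
and build_transition_table are assumptions made for this
illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, Optional[str], str]  # (I1, character seen, I2)


def classify(ch: Optional[str]) -> str:
    """Name the character subset; a reduced stand-in for the ASCII vector."""
    if ch is None:
        return "Null"
    if ch.isupper():
        return "Capital letters"
    if ch.islower():
        return "Small letters"
    if ch.isdigit():
        return "Numbers"
    return "Other"


def build_transition_table(trace: List[Triple]) -> Dict[Tuple[str, str], Dict[str, List[Optional[str]]]]:
    """Group the characters seen on each I1 to I2 transition by subset."""
    table = defaultdict(lambda: defaultdict(list))
    for i1, ch, i2 in trace:
        members = table[(i1, i2)][classify(ch)]
        if ch not in members:  # store each element once
            members.append(ch)
    return table


trace = [("B", None, "R"), ("R", "A", "R"), ("R", "b", "R"),
         ("R", "C", "R"), ("R", "d", "C"), ("C", "D", "R"),
         ("R", "F", "R"), ("R", "G", "R"), ("R", None, "S")]
table = build_transition_table(trace)
print(list(table))  # the five transitions named in the text
print(dict(table[("R", "R")]))
```

The R to R record ends up with two active sets, capital
letters and small letters, which is the information the
generalization step needs.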

4. Implementation

The context free preprocessor consists of two main modules:
the scanner and the insertion modules. Another important
module, which is not part of the preprocessor, is the user
monitor. The monitor gathers the actions of the user and
creates two arrays. One array contains the sequence of
instructions the user provided and the other contains
information on what was true before an instruction was
executed. The information that is gathered is then passed to
the appropriate preprocessor.

The example instruction and character sequences
given in Figure 33 will be used to explain the mechanism of
the preprocessor. Figure 33 is illustrative of a collection
of actions that were performed by some user. The user's goal
is: Change all lower case letters in a text file into upper
case letters. The user has activated the condition monitor,
positioned the cursor at the beginning of a line of text and
moved right along the line, changing the lower case letters
to upper case whenever one appeared above the cursor.
Figure 33 is an example of output from the monitor, assuming
the line the user processed was "The numbers 1, 2, 3, 5, 7
ARE prime.". The first column in Figure 33 is the character
array. It contains the character under the cursor prior to
execution of the instruction in column two. Column two is a
trace of the actions performed by the user. The "r"
represents the "move cursor right" instruction and the "c"
represents a change without cursor reposition instruction.
Figure 33 can be read as: The character in column one was
observed and the instruction in column two was executed.
[Figure 33: two-column listing; left column: character vector; right column: instruction vector]

Figure 33. Monitor Output

The scan module of the preprocessor is activated
when the user indicates the representative example is
complete. Let 'inst-index' be an index for the instruction
array that is initialized to 1. The first step is to create
a transition from the start instruction to the first
instruction in the instruction array and add the transition
to the transition table. This transition indicates the
beginning of the program and will transition to the first
instruction provided on a null condition. The module then
moves down the instruction array creating other transitions
and adding them to the transition table. Duplicate
transitions will not appear in the table. A transition is
defined as a pair (I1,I2), where I1 and I2 are instructions
and I2 follows I1 within the instruction array. The
instruction array in Figure 33 yields transitions (R,C),
(C,R), (R,R).

The transitions are constructed by indexing through
the instruction array. The instructions at inst-index and
inst-index + 1 form a transition. The transition is then
matched against the transition table. If a match occurs, the
character in the character array at inst-index + 1 is
extracted and its ASCII value is used to index into the
ASCII vector. The value stored in the ASCII vector is used
as an exponent for two and the result is stored in a
temporary variable. A bit by bit logical OR is performed
between the temporary variable and the Active-sets variable
for the transition and the result is stored in Active-sets.
Active-sets thus records every set from the partition that
has had elements seen on the transition. The operation
described above allocates one bit for each set in the
partition. If Active-sets equals 1 then bit one of
Active-sets is a 1, signifying that at least one element of
set one has been seen on this transition. A two would
signify that some element of set two had been seen and a
three would signify that some element of set one and some
element of set two had been seen.
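
The Active-sets update can be sketched in a few lines of C. The function ascii_set below is a hypothetical stand-in for the ASCII vector; the set numbers it returns (capital letters 0, small letters 1, numerics 2, everything else 3) are an assumption chosen to agree with the bit patterns used in the examples of this section, and NSETS is likewise assumed.

```c
#include <assert.h>

#define NSETS 4   /* assumed number of sets in the partition */

/* Hypothetical stand-in for the ASCII vector: maps a character
   to the number of the set that contains it. */
static int ascii_set(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return 0;   /* capital letters */
    if (c >= 'a' && c <= 'z') return 1;   /* small letters   */
    if (c >= '0' && c <= '9') return 2;   /* numerics        */
    return 3;                             /* everything else */
}

/* Record that character c was seen on a transition whose current
   Active-sets value is active_sets; return the new value. */
unsigned mark_active(unsigned active_sets, unsigned char c)
{
    int set = ascii_set(c);
    unsigned temp;
    assert(set >= 0 && set < NSETS);
    temp = 1u << set;            /* two raised to the set number */
    return active_sets | temp;   /* bit by bit logical OR */
}
```

Seeing 'A' and then '7' on a fresh transition leaves Active-sets at five (0101 binary) under these assumed set numbers.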

In the transition table are fields for each set that
has been determined to be active for the transition. Within
each of the set fields there are three subfields: the first
is the set name, the second is a count of the elements seen
for the set and the last is a pointer to the start of a
circularly linked list containing the elements used from the
set. The value that was obtained from the ASCII vector is
used as a set name and matched against the set name of each
of the set fields. If the set name matches an entry, the
character at inst-index + 1 is added to the linked list in
lexicographical order if not already on the list and the
count is incremented by one. If a match does not occur on
the set name, a new set field is created and given the name
that was obtained from the ASCII vector, the count is set to
one, and the character is put on the list.

When the scan module reaches the end of the input,
the transition table contains an entry for each transition
that was seen. Each transition is associated with all the
sets that had elements seen with the transition. Finally,
each transition is associated with the actual elements
through the linked list for each set. The information is
then passed to the insertion module for analysis. Figure 34
shows the completed transition table and the linked list of
elements for each set.

Once a completed transition table has been created,
control is passed to the insertion module. The insertion
module processes the information in the transition table and
assigns a condition for each transition.
[Figure 34: completed transition table; its set fields hold the list pointers <1> through <7>]

NOTE: The notation <1>, <2>, etc. represents a pointer to
the linked list headed by the same symbol.



Figure 34. Completed Transition Table
The Active-sets entries provide an efficient
mechanism for recognizing potential conflicts on emanating
arcs. Performing a bit by bit AND on the Active-sets entries
that have a common originating instruction yields the source
of conflicts. The bit positions that are on (bit equals 1)
are the set (or sets) that have had elements on multiple
transitions. For example, let (I1,I2) and (I1,I3) be entries
in the transition table with Active-sets values of five
(0101 binary) and three (0011 binary) respectively. Let Q
equal the result of the bit by bit AND of the Active-sets
values given above (i.e. 0001). Q indicates that there is a
conflict between the transition (I1,I2) and the transition
(I1,I3). Furthermore, Q indicates that the set causing the
conflict is labelled zero in the hierarchy of Figure 30
because the on bit is in the rightmost position, which
corresponds to two raised to the zero exponent. Using the
exponent to enter the hierarchy, it can be determined that
capital letters were seen on both transitions. Once all the
conflicts for transitions with the same originating
instruction are known, the conflicts must be resolved before
an assignment of conditions can be made.
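
The conflict test is small enough to show directly. The following C sketch is an illustration only; the function names and the nsets parameter are hypothetical and do not come from the thesis listings.

```c
#include <assert.h>

/* Bits that are on in both Active-sets values mark the sets that
   had elements seen on both transitions (the value Q in the text). */
unsigned conflict_mask(unsigned active_a, unsigned active_b)
{
    return active_a & active_b;   /* bit by bit AND */
}

/* Return the number (exponent) of the lowest conflicting set,
   or -1 when the mask shows no conflict. */
int first_conflicting_set(unsigned mask, int nsets)
{
    int k;
    assert(nsets > 0 && nsets <= 16);
    for (k = 0; k < nsets; k++)
        if (mask & (1u << k))
            return k;
    return -1;
}
```

With the values from the text, conflict_mask(5, 3) gives 1, and first_conflicting_set(1, 4) gives 0, the set labelled zero in the hierarchy.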

Extending the example given above, assume that eight
capital letters were seen on transition (I1,I2) and four
capital letters were seen on the transition (I1,I3). A
partial condition can be constructed for the transition
(I1,I2) as a set difference between the set of capital
letters and the actual elements seen on the transition
(I1,I3). The partial condition for the (I1,I3) transition
becomes the set of capital letters that were actually seen
with this transition. The initial conditions for these
transitions become the union of the sets indicated in
Active-sets as not being in conflict and the sets created by
the resolution of conflicts. Therefore, the condition for
(I1,I2) is
$(\{x \mid x \in \text{capital letters}\} - \{x \mid x \in \text{capital letters on other transitions}\}) \cup \{x \mid x \in \text{numeric}\}$,
and the condition for (I1,I3) becomes
$\{z \mid z \in (\{\text{actual capital letters seen}\} \cup \{\text{small letters}\})\}$.
In this example, it was assumed that the sets numeric and
small letters were an appropriate generalization for the
transition. In practice this cannot be done without
consideration of the number of elements that have been seen
from the set on the transition. If the count field for the
set exceeds a threshold value for the set, the
generalization may be made; otherwise the elements
themselves become the partial condition for the transition.

After a condition has been constructed for a
transition, a final strong generalization technique is
employed. The Active-sets value for the transition again
supplies the starting point for this technique. Notice that
adjacent bits in Active-sets correspond to adjacent nodes in
the hierarchy. Therefore, a check is made of the Active-sets
value to see if it has adjacent bits with a value of one. If
it does, then a generalization may be attempted. Assume the
condition (({capital letters} - {A, E, I, O, U}) U {small
letters} U {numeric}) has been constructed for some
transition. The Active-sets value for this transition must
be seven (0111 binary). With the default hierarchy in Figure
30, a generalization to Alphabetic and then to Alpha-numeric
would be attempted. Notice that a generalization to
Alpha-numeric would fail because of a conflict with another
transition. Intuitively ({alpha-numeric} - {A, E, I, O, U})
would be a correct choice for the condition for this
transition. A general procedure for the construction of
generalized conditions is given below.

A set of nodes $Y = \{y_1, y_2, \ldots, y_n\}$ is
generalizable to a node $X$ if the nodes of $Y$ form a
complete and exhaustive set of leaves of the subtree rooted
at $X$. Further, a set of nodes $Z = \{z_1, z_2, \ldots, z_m\}$
is generalizable to the set $W = \{w_1, w_2, \ldots, w_j\}$,
$j < m$, where each $w$ is a generalization of a subset of $Z$.

IF the condition $= F_1 \cup F_2 \cup \ldots \cup F_n$

where $F_i = z_i - q_i$, $i = 1, \ldots, n$

where $q_i \subset z_i$ ($q_i$ possibly null)

THEN

the condition is set to $W - \bigcup_{i \le n} q_i$

where $W$ is the smallest set

$W = \{w_1, w_2, \ldots, w_j\}$

such that $W$ generalizes $\{z_1, z_2, \ldots, z_n\}$
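
The adjacent-bit check that triggers an attempt at generalization can be sketched in C as follows. This covers only the trigger, not the walk up the hierarchy; the function names and the nsets parameter are hypothetical.

```c
#include <assert.h>

/* Bit k of the result is on when sets k and k+1 are both active,
   i.e. when two adjacent nodes of the hierarchy were both seen. */
unsigned adjacent_pairs(unsigned active_sets, int nsets)
{
    unsigned mask, a;
    assert(nsets > 0 && nsets <= 16);
    mask = (1u << nsets) - 1u;     /* keep only the bits in use */
    a = active_sets & mask;
    return a & (a >> 1);
}

/* A generalization may be attempted only if some pair of
   adjacent bits is on. */
int may_generalize(unsigned active_sets, int nsets)
{
    return adjacent_pairs(active_sets, nsets) != 0;
}
```

For the Active-sets value seven (0111 binary) the check succeeds, while five (0101 binary) has no adjacent bits on and no generalization is attempted.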
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C. DESIGN FOR A CONTEXT SENSITIVE ENVIRONMENT 
1. Overview

Condition generation in the context sensitive
environment is a more difficult task than in the context
free environment. This difficulty arises from the scope of
knowledge required to make decisions on what a condition is
to be. The conditions themselves are more complex because
they depend not only on the character that is being seen,
but also on characters that precede and follow the current
character under consideration. The following example will be
used to illustrate the difficulties and our solution to this
problem. Assume a user wishes to capitalize all occurrences
of the word 'time' in some text file. Also assume that the
word occurs at the beginning, at the end, and in the middle
of sentences in the text file. The question is how to
construct a program that performs the desired function given
only the actions the user performs as an example of the
required program.

The assumption about the position of the word 'time'
in the text file implies that the requested action needs to
be accomplished on strings that have very different
characteristics. Certainly, both 'time' and 'Time' should be
capitalized, as should 'time,', 'time?' and 'time<sp>'. On
the other hand, the string 'time' should not be capitalized
when it occurs within a word like 'sometime' or 'timely'.

Any generated program that behaves as described
above must be able to recognize an occurrence of the string
or some variation of the string. The totality of this
information must be glued together to provide a single
condition that is descriptive of what the surrounding
environment must be like before the action is performed. The
implication is that the condition itself must be able to
perform checking and look-ahead. In other words, the
condition for the transition to the operation must in fact
be a procedure which responds 'true' whenever the string of
interest is recognized. Assume for the present that the
string of interest can be discerned from the user's actions
(a hard problem by itself, see Angluin [19]). One must
wonder how such a procedure can be constructed and then
inserted into the generated program, where it performs the
function of a condition on some transition in the program.
Figure 35 shows a procedure which recognizes the word
'time'. Note the robustness of the procedure in that it
distinguishes between the differing occurrences of 'time' as
mentioned above. Figure 35 points out that the problem is
not just generating a procedure as a condition but also
generating conditions within the procedure that is to be the
overall condition. The arcs labeled 'T v t' and
'<SP> v {punctuation}' should be noted with interest because
they provide the robustness the condition procedure needs.
The discovery of arc labels for the condition procedure will
be discussed next.
[Figure 35: state diagram of the condition procedure; recoverable arc labels: ({ASCII} - {<sp>}), (<sp> v {Punc.}), (T v t), (<sp> v {Punc.}); final box: Requested Operation]

Figure 35. Condition for "time" and "Time".
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2. Implementation 



The monitoring of user actions provides the
instruction and character sequences in the same manner as in
the context free mode. Consideration was given to requiring
that more information be provided by the monitor; however,
the notion was discarded because it would require the user
to be aware of the functioning of the preprocessor.
Requiring the user to provide information to the system
would betray our goal for the system. The user should only
be required to initiate the system and then perform editing
as if the system were not actively monitoring his actions.
We feel the requirement of specifying whether the user wants
to perform context free or context sensitive operations is
the maximum that should be asked. If it were feasible to
recognize the difference between the two modes from the
user's actions alone, this limitation would also be removed.

Given only the instruction sequence, the character
sequence, and the information that the environment is
context sensitive, the first assignment of the context
sensitive preprocessor is to discern the string of
characters upon which some operation is to be performed.
This is a pattern recognition problem of considerable
difficulty. Angluin [19] provides the following theorem:
"There is an effective procedure which, when given a sample
S as input, outputs a pattern p which is descriptive of S".
The sample S is a subset of the set of all strings over the
alphabet of the language. The effective procedure is
computationally expensive and not desirable to implement in
our system. The procedure is an enumeration technique on
patterns with a length less than the shortest example in the
sample set S. Each of the enumerated patterns is tested to
determine if it is descriptive of the entire set S. The
longest pattern that is descriptive of S is the most
specific pattern for the set. Clearly, as the length of the
sample grows, the number of enumerated patterns will grow
exponentially. Angluin [19] states, "in the general case,
the test performed on the patterns is an NP-complete
problem.". The test she is referring to is the check to see
if the enumerated pattern is descriptive of S.

For implementation purposes, we need a mechanism
that falls well short of the exponential behavior of the
effective procedure mentioned above. The text editing domain
has two types of instructions for the purpose of this paper.
The first type will be called cursor positioning
instructions while the second type will be called data
manipulating instructions. Assuming the text file is to be
represented as a linear array, only one cursor positioning
instruction need concern us. All cursor positioning commands
such as move left, move up or move down can be represented
as move right instructions. Data manipulation instructions
operate on one character and do not reposition the cursor.

The method we have adopted for determining the
string of interest and the context of the string is based on
the above definition of the types of instructions available
in the text editing domain. The preprocessor scans the
instruction sequence looking for an occurrence of a data
manipulation instruction. The character associated with this
instruction is then taken as the first character of the
string of interest. Other characters are added to the string
by continuing the scan until multiple occurrences of cursor
positioning instructions are encountered. A hypothesis is
then constructed consisting of three parts. The first part
is the beginning context. It is constructed from the
characters that preceded the string in the character
sequence. The second part is the string itself and the final
part is the ending context, constructed from the characters
seen after the string. For engineering considerations, the
number of characters in the beginning and ending contexts
will each be limited to twenty characters. The probability
of the context exceeding twenty characters on both sides of
the string in the text editing domain is small enough to
ignore.
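
A minimal C sketch of this scan follows, assuming the instruction sequence is a string of 'R' (move right) and 'C' (change without repositioning) and that the character sequence has the same length. The struct layout, the name extract_hypothesis and the bound MAX_STRING are hypothetical; only the twenty character context limit comes from the text. The string is taken to end when a change is no longer followed by another change after one move right.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CONTEXT 20   /* context limit stated in the text */
#define MAX_STRING  32   /* assumed bound on the string of interest */

struct hypothesis {
    char begin[MAX_CONTEXT + 1];   /* beginning context */
    char string[MAX_STRING + 1];   /* string of interest */
    char end[MAX_CONTEXT + 1];     /* ending context */
};

/* Build one hypothesis from parallel instruction and character
   sequences. Returns 0 if no data manipulation instruction occurs. */
int extract_hypothesis(const char *inst, const char *chars,
                       struct hypothesis *h)
{
    size_t n = strlen(inst), first = 0, j, len = 0, start, cnt;
    assert(strlen(chars) == n);

    while (first < n && inst[first] != 'C')
        first++;
    if (first == n)
        return 0;

    /* Beginning context: at most MAX_CONTEXT characters seen just
       before the first data manipulation instruction. */
    start = first > MAX_CONTEXT ? first - MAX_CONTEXT : 0;
    memcpy(h->begin, chars + start, first - start);
    h->begin[first - start] = '\0';

    /* String of interest: the characters observed at each 'C'. */
    j = first;
    while (j < n && inst[j] == 'C' && len < MAX_STRING) {
        h->string[len++] = chars[j];
        j++;                               /* past the change */
        if (j < n && inst[j] == 'R')
            j++;                           /* past the changed character */
    }
    h->string[len] = '\0';

    /* Ending context: at most MAX_CONTEXT characters that follow. */
    cnt = n - j;
    if (cnt > MAX_CONTEXT)
        cnt = MAX_CONTEXT;
    memcpy(h->end, chars + j, cnt);
    h->end[cnt] = '\0';
    return 1;
}
```

In this sketch a space stands for the <sp> symbol used in the text.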

Once a hypothesis is proposed it is set aside as an
active hypothesis and scanning of the input continues. Other
cases of data manipulation instructions surrounded by cursor
positioning instructions will result in other hypotheses
being constructed. As these hypotheses are added to the
active hypothesis list they are checked for consistency, and
if a new hypothesis causes conflicts they are resolved by
constructing another hypothesis from the conflicting
hypotheses. To demonstrate this mechanism we present an
example which will illustrate the generation of hypotheses
and their resolution into a condition function. The example
used is the construction of the function which will
recognize the string 'time'.

Suppose the text file contained the following
sentences somewhere in the file.

The time is two oclock.

It is time to go to bed.

Time the runner.

Did you run out of time?

Also, suppose the user has specified the environment is to
be context sensitive and has begun to perform actions on the
file. The monitor could create the following instruction and
character sequence fragments from the user moving through
the text file and capitalizing these occurrences of 'time'.

(RRRRCRCRCRCRRRR ...)

(The tTiImMeE is ...)

(RRRRRRCRCRCRCRRRR ...)

(It is tTiImMeE to ...)

(RCRCRCRRRRR ...)

(TiImMeE the ...)

(... RRRRRRRRRRRCRCRCRCRR)

(... run out of tTiImMeE?)

This example is not to imply the user must change all
occurrences in the text file, but he should provide enough
examples from the file to ensure his desires are understood.
If the user has not supplied a distinguishing set of
examples and an incorrect program is generated, he may add
to the set of examples.

Scanning the first instruction sequence until the
first data manipulation instruction results in the string
'time' being constructed. The resulting hypothesis is that
the string 'time' is within the context of 'The<sp>' and
'<sp>is two oclock.'. The hypothesis may be viewed as the
following data structure.

Hypothesis 1:

Begin context: The<sp>

String: time

End context: <sp>is two oclock.

A second hypothesis would be generated for the next portion
of the instruction sequence as shown below.

Hypothesis 2:

Begin context: It is<sp>

String: time

End context: <sp>to go to bed.

A comparison of these two hypotheses indicates a
disagreement between the contexts. The conflict is resolved
by determining the longest beginning and ending contexts
that agree between the two hypotheses and generating a
hypothesis reflective of this agreement. By working backward
from the last character in the begin context for both
hypotheses, it is possible to ascertain that the only
character in agreement is the space. Working forward from
the first character in the end context for both hypotheses,
again the only character in agreement is the space. A third
hypothesis with the new begin and end contexts is generated
as follows:

Hypothesis 3:

Begin context: <sp>

String: time

End context: <sp>
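
The resolution step amounts to a longest common suffix of the two begin contexts and a longest common prefix of the two end contexts. A C sketch under that reading is given below; the function names are hypothetical, a space stands for <sp>, and the caller is assumed to supply an output buffer of at least MAX_CONTEXT + 1 characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CONTEXT 20   /* context limit stated in the text */

/* Work backward from the last character of both begin contexts
   and keep the part that agrees. Returns its length. */
size_t common_suffix(const char *a, const char *b, char *out)
{
    size_t la = strlen(a), lb = strlen(b), n = 0;
    assert(out != NULL);
    while (n < la && n < lb && n < MAX_CONTEXT &&
           a[la - 1 - n] == b[lb - 1 - n])
        n++;
    memcpy(out, a + la - n, n);
    out[n] = '\0';
    return n;
}

/* Work forward from the first character of both end contexts
   and keep the part that agrees. Returns its length. */
size_t common_prefix(const char *a, const char *b, char *out)
{
    size_t n = 0;
    assert(out != NULL);
    while (n < MAX_CONTEXT && a[n] != '\0' && a[n] == b[n])
        n++;
    memcpy(out, a, n);
    out[n] = '\0';
    return n;
}
```

Applied to Hypotheses 1 and 2, both functions return a single space, which is the context recorded in Hypothesis 3. A returned length of zero corresponds to the null string case.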

This hypothesis specifies that the string 'time'
must be preceded and followed by a space. Note that the
hypothesis implies the user is allowed to specify one string
during an example computation. It is also implied that there
must be a begin and an end context for the string. Since it
is possible to have two hypotheses where one of the context
strings does not agree in any of the characters, a method
must exist to provide the appropriate context.

Whenever the comparison between the contexts of two
hypotheses results in the null string, a disjunction is
formed from the characters immediately next to the string.
For example, the last instruction sequence given above would
give the hypothesis:

Hypothesis 4:

Begin context: Did you run out of<sp>

String: time

End context: ?

A comparison between Hypothesis 3 and Hypothesis 4
would result in the null string for the end context. Since
there must be an end context, the disjunction of <sp> and ?
is formed and this becomes the end context for the new
hypothesis. Generalization techniques that were mentioned in
the section on the context free environment are then applied
in an attempt to reduce the end context to the most general
context consistent with the data seen. The only alteration
in the generalization scheme is the lowering of the
threshold values for important sets. In this example, the
threshold value for the punctuation set would be lowered to
1 and the ending context would become
$\{x \mid x = \text{space} \lor x \in \{\text{Punctuation}\}\}$.

The final problem to be solved is the recognition of
variations in a string. Examples of variations of a string
are 'Time' and 'time', or 'enclosure' and 'inclosure'. As
mentioned, if the user intends to capitalize all occurrences
of 'time', 'Time' is to be included. Note these variations
of the string become the compound labels for the arcs in
Figure 35. The system includes a rule that enables the
recognition of variations of strings provided the user gives
an example of the variation. The rule simply states that the
string length will be established to be as long as the
longest string encountered during processing. Again, using
the example, the hypothesis for 'Time the runner.' would be:

Hypothesis 5:

Begin context: ... T

String: ime

End context: <sp>the runner.

It has been established by preceding user actions
that the string length for the hypothesis should be 4. By
matching the pattern in Hypothesis 5 with the string from
Hypothesis 4 it can be determined that the string in
Hypothesis 5 should be expanded by inserting a 'T' in front
of the string. Another hypothesis is then generated where
the string will be the disjunction between the strings
'time' and 'Time'. The final hypothesis from the example
would then be:

Hypothesis 6:

Begin context: <sp>

String: 'time' v 'Time'

End context: $\{x \mid x = \text{space} \lor x \in \text{Punc.}\}$

Once this hypothesis has been generated, it is then
used to examine the input for negative examples that can
strengthen or weaken the hypothesis. Suppose the input
contained the fragment "... timely results ...". Processing
the input with Hypothesis 6 would show a match for the
string, but the end context would not agree; therefore, the
hypothesis will be strengthened by changing the end context
as shown below:

Final Hypothesis:

Begin context: <sp>

String: 'time' or 'Time'

End context: $\{x \mid x = \text{space} \lor x \in \text{Punc.} \land x \notin \text{small letters}\}$

After the input has been processed and a final
hypothesis proposed, the hypothesis is used to construct a
procedure such as shown in Figure 35. The first part of the
procedure to be constructed is the transitions for the
beginning context. The states in the procedure are the
instructions in the instruction set, and the arc labels
consist of the information in the final hypothesis. A start
state is placed in the procedure with an arc to a move right
instruction (R). Since the procedure is a string match or
look-ahead routine, all states other than the start state
will be move right instructions. Each of the states will
have two arcs exiting it. The labels on these two arcs will
be the negation of each other.

The construction is accomplished by placing the
first character of the begin context on the exiting arc
going to a new move right state. The other arc is labeled
with the negation of the character and this arc terminates
at the first move right state. Each character of the begin
context creates another move right state labeled as
mentioned.

The string from the hypothesis is then used to
complete the procedure that has been partially constructed.
If the string is composed of disjunctions, the characters
are used to form disjunctions. Each of the disjunctions is
combined with conjunctions. The final hypothesis above
provides a string of 'time' or 'Time'. The conjunction of
disjunctions will be formed as:

('T' v 't') & ('i' v 'i') & ('m' v 'm') & ('e' v 'e')

Upon reduction the string will be expressed as:

('T' v 't') & 'i' & 'm' & 'e'

Each disjunction becomes a label on an arc to a new move
right state and the negation becomes the label on an arc
back to the original move right state.
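
The formation and reduction of the conjunction of disjunctions can be sketched in C for the case of two equal-length variants. The representation of an arc label as a short string of accepted characters, and the names build_labels and matches, are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN 32   /* assumed bound on the string of interest */

/* Build one arc label per character position from two variants of
   the string. A label holds one character when the variants agree
   (the reduced form) and two when they differ. Returns the number
   of positions, or -1 if the variants cannot be aligned. */
int build_labels(const char *a, const char *b, char labels[][3])
{
    size_t i, n = strlen(a);
    if (n != strlen(b) || n > MAX_LEN)
        return -1;
    for (i = 0; i < n; i++) {
        labels[i][0] = a[i];
        labels[i][1] = (a[i] == b[i]) ? '\0' : b[i];
        labels[i][2] = '\0';
    }
    return (int)n;
}

/* The conjunction holds when every position of the text satisfies
   the disjunction stored in the corresponding label. */
int matches(char labels[][3], int n, const char *text)
{
    int i;
    assert(n >= 0 && n <= MAX_LEN);
    for (i = 0; i < n; i++) {
        if (text[i] == '\0' || strchr(labels[i], text[i]) == NULL)
            return 0;
    }
    return 1;
}
```

For the variants 'Time' and 'time' the first label holds both 'T' and 't' and the remaining labels hold a single character each, which is the reduced expression given above.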

Finally, the end context is added in the same manner
as the begin context. The first character becomes the label
on the last move right state created from the string and new
states are added for each character in the end context. The
result of these operations is displayed in Figure 35.
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IV. CONCLUSIONS AND RECOMMENDATIONS 



A. SYNTHESIZER 

The synthesizer that has been implemented for this
thesis will produce programs from example computations in a
reasonable amount of time. The system response for most of
the traces was within 10 seconds on a Digital Equipment
Corporation PDP-11/50 minicomputer. The response time is a
function of the length of the trace and the number of
multiple occurrences of a particular instruction or set of
instructions in the final algorithm, with multiple
occurrences of an instruction affecting response time the
most. As Biermann [17] has noted, this has a nice
implication for programming by example because most
algorithms do not exhibit the characteristic of having a
large number of instances of the same instruction. In other
words, almost all multiple occurrences of an instruction in
an input trace are indicative of a loop in the algorithm.

In all of the test cases except those that required a
large number of backups, static processing accounted for at
least half of the total response time. Future modifications
to the synthesizer which would decrease the total response
time could be directed toward designing the static
processing stage more efficiently. However, the trade-off
between static processing and dynamic processing must be
kept in perspective. Static processing is a linear function
of the length of the trace, whereas dynamic processing,
since it is an enumerative search technique, is an
exponential function of the length of the trace.

Another area which should be considered is the dynamic
processing stage. There exists a plethora of research
questions within this area, the primary one being: Can more
information be gleaned from the input trace during static
processing which will decrease the search time for dynamic
processing? Difference sets and couple-classes provide some
powerful mechanisms for decreasing the amount of search;
however, lower bounds computations on the number of states
required by the machine often increase the amount of search.
Lower bounds are restrictive in nature. They are designed to
force the final algorithm into a minimum state configuration
which, in many cases, causes extra search time. Relaxation
of the lower bounds computation will result in a final
algorithm which may not be expressed in a minimum number of
states, but which will still be deterministic. There might
be better methods of initially computing the number of
states which would result in a closer estimate of the actual
number of states required for the algorithm. Obviously, the
closer the initial guess is to the actual requirement, the
less backup incurred, and, therefore, the less search time
required.

Since the amount of search required is governed by the
failure memory entries, the more dense the failure memory
can be made, the more directed the search becomes. So
another area for research is to determine if more
information exists in the failure memory entries than is
currently being used. How much information do the structure
factor and the free state factor provide? Is there another
factor which would be useful?

Finally, a more general question can be addressed. The
underlying structure of this technique is an enumerative
search. Can the technique be generalized to include other
algorithms which are enumerative in nature? What
modifications to the failure memory are needed? How would
difference sets and couple-classes be redefined?

B. CONDITION PROCESSING

The condition processor front-end to the synthesizer
relieves the user from worrying about some of the control
structure considerations by automatically generating
conditions. Another addition which would increase the power
of the synthesizer is an automatic loop variable generator
as discussed by Biermann [18]. Although the text editing
environment has been used in this thesis work, the part of
the condition processor design which deals with a context
free environment is general enough that it could be designed
to operate in any domain.

Condition generation in a context sensitive environment
is a much harder problem, further complicated by requisite
pattern matching and pattern generation. Before this type of
condition generation can be generalized, much work has to be
done to increase the efficiency of pattern generation
schemes. Angluin [19] has shown a pattern generation scheme
which is a polynomial time algorithm for pattern generation
with one variable, but the domain we have examined will
require at least two variables. No polynomial time algorithm
is known for pattern generation with two variables.
Heuristic techniques will probably be necessary to provide
methods of pattern generation which will be fast enough to
be useful over a wide range of problems.
The command line needed to run the synthesizer is:

#define EOF '\0'
#define EOLN '\n'
#det‘ine MAXCNT ‘dm 
tfdeflne MAXVCTR did 
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FMfMAXCNTj [MAXVCTRJ J 
printf("TRACE TABLE \n\n");

printf(" LEVEL TRANSITION STATE \n\n");

for (i = 1; i < triplecnt; i++)

    printf(" %2d %c %2d %c \n",

        i, TraceTable[i].N, TraceTable[i].Selector, TraceTable[i].O);
Determine a lower bound on the total number of states required
for the trace

Determine a lower bound on the number of states for each
instruction



Determine the number of different instructions
elements in the couple class 
triples.member contains the level number of 
each element within the same class 
triples.class at this point in the program is 
used as a check to ensure all levels 
have been compared 








while ((p <= memptr[i-1]) 
       && (triples[i-1].diffset[p] != k-1)) 
    p++; 

if (triples[i-1].diffset[p] == NULL) 
    triples[i-1].diffset[memptr[i]++] = k-1; 
else 
Static information gives the capability to compute lower bounds 
on the number of states needed for each instruction, and by a 
summation of those bounds, a lower bound on the total number of 
states needed in the program in order to construct a deterministic 
machine from the program trace. 
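
The summation described above can be sketched as follows. This is an illustration only: the function name total_lower_bound and the plain integer array of per-instruction bounds are assumptions and are not taken from the thesis listing. The sketch further assumes that every distinct instruction in the trace needs at least one state.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the per-instruction lower bounds to obtain a lower bound on the
   total number of states in the synthesized machine.
   per_instruction[k] is the (assumed) bound for the k-th distinct
   instruction; count is the number of distinct instructions. */
int total_lower_bound(const int per_instruction[], size_t count)
{
    int total = 0;
    size_t k;

    assert(per_instruction != NULL || count == 0);
    for (k = 0; k < count; k++) {
        /* each instruction that occurs needs at least one state */
        assert(per_instruction[k] >= 1);
        total += per_instruction[k];
    }
    return total;
}
```

A search that starts from this total never wastes effort on machines that are provably too small to be deterministic for the given trace.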
/* As each element of the difference set is checked, if 
the element is in the same couple-class as any of the 
previous elements then the instructions are the same. 








/* 'save' compares the value computed at each level for 
a particular instruction and saves the maximum value 
for the trace 
/* Establish the TraceTable which will be used in the dynamic 
processing of the instruction trace. Since it is necessary 
to map the triples information from a three column structure 
to a two column structure, some offset must occur 
/* The dynamic processing involves assigning values to the selector 
variables. This is done in conjunction with Failure Memory entries 
providing the pruning of the search tree. 
in the trace, it is an old state. 

This must always be the case with a forced 
assignment. 

TraceTable[wklvl].State = OLD; 
TraceTable[wklvl].State = OLD; 
i = wklvl; 
Fixup(wklvl, TraceTable[wklvl].Selector); 

Fixup attempts to correct the contradiction by incrementing 
the Selector to n+1 
Difference Vector resolution prevents future assignments being 
made that are shown to be illegal from the static processing 
Fixup(wklvl, TraceTable[wklvl].Selector); 

/* If free states are available, use one of them and continue 
with the resolution. 
while (TraceTable[n].DiffSet[i] != NULL) 
return (0); 
if (TraceTable[row].Class != NULL) 
    for (i=row-1; i > 0; i--) 
        if (TraceTable[row].Class == TraceTable[i].Class) 
            if (col == TraceTable[i].Selector) 
                if (FM[row-1][TraceTable[i-1].Selector].W == 
/* AddState increases the bound on the number of states allowed 
for a particular instruction when there are free states 
available to the machine and the instruction at the current 
level needs an extra state in order for the machine to remain 
deterministic 
debug level for the program 






LIST OF REFERENCES 



1. Iverson, K., "Operators", ACM Transactions on Programming 
Languages and Systems, Vol. 1, No. 2, October 1979. 

2. Dijkstra, E. W., A Discipline of Programming, Prentice 
Hall Inc., 1976. 

3. Green, C., "The Design of the PSI Program Synthesis 
System", Proceedings Second International Conference 
on Software Engineering, P. 4-18, October 1976. 

4. Glass, R. L., Computing Projects Which Failed, 
Computing World, 1977. 

5. Heidorn, G. E., "Automatic Programming Through Natural 
Language Dialog: A Survey", IBM J. Res. Dev. 20, P. 302, 
July 1976. 

6. Biermann, A. W., Natural Language Processing, paper in 
preparation, Naval Postgraduate School, 1981. 

7. Walker, D. E., Understanding Spoken Language, Elsevier 
North-Holland Inc., 1978. 

8. Smith, D. R., "A Design for an Automatic Programming 
System", Proceedings of the 7th International Joint 
Conference on Artificial Intelligence, Vancouver, 
B. C., Canada, 1981. 

9. Manna, Z. and Waldinger, R., "A Deductive Approach to 
Program Synthesis," ACM Transactions on Programming 
Languages and Systems, Vol. 2, No. 1, P. 90-121, 
January 1980. 

10. Manna, Z. and Waldinger, R., "Synthesis: Dreams => 
Programs," IEEE Transactions on Software Engineering, 
Vol. SE-5, No. 4, July 1979. 

11. Biermann, A. W., "Approaches to Automatic Programming", 
Advances in Computers, Vol. 15, P. 1-63, 1976. 

12. Smith, D. R., "A Survey of the Synthesis of LISP 
Programs from Examples", Proceedings of the 
International Workshop on Program Construction, 
Bonas, France, 1980. 

13. Summers, P. D., "A Methodology for LISP Program 
Construction from Examples", JACM 24, P. 161-175, 
1977. 

14. Biermann, A. W., "The Inference of Regular LISP Programs 
from Examples", IEEE Transactions on Systems, Man, and 
Cybernetics, Vol. SMC-8(8), P. 585-600, August 1978. 

15. Gold, E. M., "Language Identification in the Limit", 
Information and Control, Vol. 10, P. 447-474, 1967. 

16. Biermann, A. W. and Krishnaswamy, R., "Constructing 
Programs from Example Computations," IEEE Transactions on 
Software Engineering, Vol. SE-2, No. 3, P. 141-153, 
September 1976. 

17. Biermann, A. W., Baum, R. I. and Petry, F. E., "Speeding 
up the Synthesis of Programs from Traces," IEEE 
Transactions on Computers, Vol. C-24, No. 2, P. 122-136, 
February 1975. 

18. Biermann, A. W., "Automatic Insertion of Indexing 
Instructions in Program Synthesis", International 
Journal of Computer and Information Sciences, Vol. 7, 
No. 1, March 1978. 

19. Angluin, D., "Finding Patterns Common to a Set of 
Strings", Journal of Computer and System Sciences, 
Vol. 21, No. 1, August 1980. 






BIBLIOGRAPHY 



Bibel, W., "Syntax-Directed, Semantics-Supported Program 
Synthesis", Artificial Intelligence, Vol. 14, P. 243-261, 
1980. 

Follett, R., "Synthesising Recursive Functions with Side 
Effects", Artificial Intelligence, Vol. 13, P. 175-200, 
1980. 

Hewitt, C. E. and Smith, B., "Toward a Programming Apprentice," 
IEEE Transactions on Software Engineering, Vol. SE-1, No. 1, 
P. 26-45, March 1975. 






INITIAL DISTRIBUTION LIST 

                                                  No. Copies 

1. Defense Technical Information Center               2 
   Cameron Station 
   Alexandria, Virginia 22314 

2. Library, Code 0142                                 2 
   Naval Postgraduate School 
   Monterey, California 93940 

3. Department Chairman, Code 52                       1 
   Department of Computer Science 
   Naval Postgraduate School 
   Monterey, California 93940 

4. Professor Douglas R. Smith, Code 52SC              1 
   Department of Computer Science 
   Naval Postgraduate School 
   Monterey, California 93940 

5. Captain C. W. Miller, USMC                         1 
   109 Arterdurn Rd. 
   Louisville, Kentucky 40222 

6. Captain J. S. Lape, USMC                           1 
   6207 Doncaster Court 
   Springfield, Virginia 22130 







Thesis M58574 
Miller                                            193232 
c.1    Condition recognition for a program synthesizer. 

DUDLEY KNOX LIBRARY 



